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METHOD FOR MANAGING FLOW BANDWIDTH UTILIZATION AT 
NETWORK, TRANSPORT AND APPLICATION LAYERS IN STORE 

AND FORWARi:^ NETWORIC 

COPYRIGHT NOTICE 
A portion of the disclosure of this patent document contains material which 
is subject to copyright protection. The copyright owner has no objection to the facsimile 
reproduction by anyone of the patent document or the patent disclosure as it appears in 
the Patent and Trademark Office patent file or records, but otherwise reserves all 
copyright rights whatsoever. 

CROSS-REFERENCE TO RELATED APPUCATIONS 
This application claims priority from the following U.S. Provisional 

Application: 

U.S. Provisional Patent Application Serial No. 60/032,485, Robert L. 
Packer, entitled, "Method for Managing Flow Bandwidth Utilization at Network, 
Transport and Application Layers in Store and Forward Network", filed December 9, 
1996. 

Further, this application makes reference to the following commonly 
owned U.S. Patent Application: 

Copending U.S. Patent Application Serial No. 08/762,828, in the name of 
Robert L. Packer, entitled "Method for Rapid Data Rate Detection in a Packet 
Communication Environment Without Data Rate Supervision," relates to a technique for 
automatically determining the data rate of a TCP connection. 

Further, this application makes reference to the following U.S. Patent 

Application: 

Copending U.S. Patent Application Serial No. 08/742,994, in the name of 
Robert L. Packer, entitled "Method for Explicit Data Rate Control m a Packet 
Communication Environment Without a Data Rate Supervision," relates to a technique 
for automatically scheduling TCP packets for transmission. 
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BACKGROUND OF THE INVENTION 

This invention relates to digital packet telecommunications, and particularly 
to management of network bandwidth based on information ascertainable from multiple 
layers of OSI network model. It is particularly useful in conjunction with data flow rate 
detection and control of a digitally-switched packet telecommunications environment 
normally not subject to data flow rate control. 

The ubiquitous TCP/IP protocol suite, which implements the world-wide 
data communication network environment called the Internet and is also used in private 
networks (Intranets), intentionally omits explicit supervisory function over the rate of data 
transport over the various media which comprise the network. While there are certain 
perceived advantages, this characteristic has the consequence of juxtaposing very high- 
speed packet flows and very low-speed packet flows in potential conflict for network 
resources, which results in inefficiencies. Certain pathological loading conditions can 
result in instability, overloading and data transfer stoppage. Therefore, it is desirable to 
provide some mechanism to optimize efficiency of data transfer while minimizing the risk 
of data loss. Early indication of the rate of data flow which can or must be supported is 
very useful. In fact, data flow rate capacity information is a key factor for use in 
resource allocation decisions. 

Internet/Intranet technology is based largely on the TCP/IP protocol suite, 
where IP, or Internet Protocol, is the network layer protocol and TCP, or Transmission 
Control Protocol, is the transport layer protocol. At the network level, IP provides a 
"datagram" delivery service. By contrast, TCP builds a transport level service over the 
datagram service to provide guaranteed, sequential delivery of a byte stream between two 
IP hosts. 

TCP flow control mechanisms operate exclusively at the end stations to 
limit the rate at which TCP endpoints emit data. However, TCP lacks explicit data rate 
control. In fact, there is heretofore no concept of coordination of data rates among 
multiple flows. The basic TCP flow control mechanism is a sliding window, 
superimposed on a range of bytes beyond the last explicitly-acknowledged byte. Its 
sliding operation limits the amount of unacknowledged transmissible data that a TCP 
endpoint can emit. 

Another flow control mechanism is a congestion wmdow, which is a 
refinement of the sliding window scheme, which employs conservative expansion to fully 
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utilize all of the allowable window. A component of this mechanism is sometimes 
referred 

to as "slow start". 

The sliding window flow control mechanism works in conjxmction with the 
Retransmit Timeout Mechanism (RTO), which is a timeout to prompt a retransmission of 
unacknowledged data. The timeout length is based on a running average of the Round 
Trip Time (RTT) for acknowledgment receipt, i.e. if an acknowledgment is not received 
within (typically) the smoothed RTT + 4*mean deviation, then packet loss is inferred and 
the data pending acknowledgment is retransmitted. 

Data rate flow control mechanisms which are operative end-to-end without 
explicit data rate control draw a strong inference of congestion from packet loss (inferred, 
typically, by RTO). TCP end systems, for example, will 'back-off, i.e., inhibit 
tninsmission in increasing multiples of the base RTT average as a reaction to consecutive 
packet loss. 

Bandwidth Management in TCP/IP Networks 

Conventional bandwidth management in TCP/IP networks is accomplished 
by a combination of TCP end systems and routers which queue packets and discard 
packets when certain congestion thresholds are exceeded. The discarded, and therefore 
unacknowledged, packet serves as a feedback mechanism to the TCP transmitter. (TCP 
end systems are clients or servers running the TCP transport protocol, typically as part of 
their operating system.) 

The term "bandwidth management" is often used to refer to link level 
bandwidth management, e.g. multiple line support for Point to Point Protocol (PPP). 
Link level bandwidth management is essentially the process of keeping track of all traffic 
and deciding whether an additional dial line or ISDN channel should be opened or an 
extraneous one closed. The field of this invention is concerned with network level 
bandwidth management, i.e. policies to assign available bandwidth from a single logical 
link to network flows. 

Routers support various queuing options. These options are generally 
intended to promote fairness and to provide a rough ability to partition and prioritize 
separate classes of traffic. Configuring these queuing options with any precision or 
without side effects is in fact very difficult, and in some cases, not possible. Seemingly 
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simple things, such as the length of the queue, have a profound effect on traffic 
characteristics. Discarding packets as a feedback mechanism to TCP end systems may 
cause large, uneven delays perceptible to interactive users. 

In a copending U.S. Patent Application Serial No. 08/742,994, in the 
name of Robert L. Packer, entitled "Method for Explicit Data Rate Control in a Packet 
Communication Environment Without Data Rate Supervision," a technique for 
automatically scheduling TCP packets for transmission is disclosed. Furthermore, in a 
copending U.S. Patent Application Serial No. 08/762,828, in the name of Robert L. 
Packer, entitled "Method for Rapid Data Rate Detection in a Packet Communication 
Environment Without Data Rate Supervision," a technique for automatically determining 
the data rate of a TCP connection is disclosed. While these patent applications teach 
methods for solving problems associated with scheduling transmissions and for 
automatically determining a data flow rate on a TCP connection, respectively, there is no 
teaching in the prior art of methods for explicitly managing TCP packet traffic based 
upon information about the flow's characteristics at multiple OSI protocol layers. 

Bandwidth management is heretofore not known to employ information 
contained in the packets corresponding to higher OSI protocol layers, even though such 
information may be extremely useful in making bandwidth allocation and management 
decisions. 

SUMMARY OF THE INVENTION 
According to the invention, in a packet communication environment, a 
method is provided for classifying packet network flows for use in determining a policy, 
or rule of assignment of a service level, and enforcing that policy by direct rate control. 
The method comprises applying individual instances of traffic objects, i.e., packet network 
flows to a classification model based on selectable information obtained from a plurality 
of layers of a multi-layered communication protocol, then mapping the flow to the 
defined traffic classes, which are arbitrarily assignable by an offline manager which 
creates the classification. It is useful to note that the classification need not be a complete 
enumeration of the possible traffic. 

In one aspect of the invention, bandwidth may be divided into arbitrary 
units, partitions, facilitating isolation and allocation. A partition is allocated for a class or 
set of traffic classes, carving the bandwidth of the associated link into multiple. 



wo 98/26510 PCT/US97/22550 

5 

independent pieces. 

In another aspect of the invention, available bandvsridth may be allocated 
among flows according to a policy, which may include any combination of guaranteed 
information rate, excess infonnation rate, the later allocated according to a priority. 

In another aspect of the invention, bandwidth resource needs of multiple 
heterogeneous requesting flows are reconciled with available bandwidth resources in 
accordance with policy of each flow based upon the flow's class. Flows requiring 
reserved service with guaranteed information rates, excess information rates or tmreserved 
service are reconciled with the available bandwidth resources continuously and 
automatically. 

In another aspect of the invention, providing an admissions policy which is 
invoked whenever a request for a bandwidth cannot be met consistently with other users 
of bandwidth. 

An advantage of network management techniques according to the present 
invention is that network managers need only define traffic classes which are of interest. 

A further advantage of the present invention is that traffic classes may 
include information such as a URI for web traffic. 

A yet further advantage of the present invention is that service levels may 
be defined in terms of explicit rates and may be scaled to a remote client or server's 
network access rate. Different service levels may be specified for high speed and low 
speed users. 

A yet further advantage of the present invention is that service levels may 
be defined in terms of a guaranteed minimum service level. 

The invention will be better understood upon reference to the following 
detailed description in connection with the accompanying drawings. 



BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. lA depicts a representative client server relationship in accordance 
with a particular embodiment of the invention; 
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Fig. 1 B depicts a functional perspective of the representative client server 
relationship in accordance witli a particular embodiment of the invention; 

Fig. IC depicts a representative internetworking environment in accordance 
witli a particular embodiment of the invention; 
5 Fig. ID depicts a relationship diagram of the layers of the TCP/IP protocol 

suite; 

Fig. IE depicts a two dimensional representation of timing relationships in 
the exchange of packets between hosts using the TCP protocol; 

Figs. 2A-2B depict representative divisions of bandwidth according to a 
10 particular embodiment of the invention; 

Figs. 2C-2E are flow charts depicting process steps according to a 
particular embodiment of the invention; 

Fig. 3 is a block diagram of a particular embodiment according to the 

invention; 

15 Fig. 4A is a block diagram of a data structure according to a particular 

embodiment of the invention; 

Fig. 4B is a block diagram of data structure according to a particular 
embodiment of the invention; and 

Figs. 5A-5H are flow charts depicting process steps according to a 
20 particular embodiment of the invention. 

DETAILED DESCRIPTION OF SPECIFIC EMBODIMENTS 
A preferable embodiment of a flow bandwidth management system 
according to the invention has been reduced to practice and will be made available under 
25 the trade name "PacketShaper'". " 

1.0 Introduction 

The present invention provides techniques to manage network bandwidtti 
such as on a network access link between a local area network and a wide area network. 
30 Systems according to the present invention enable network managers to: defme traffic 

classes; create policies which defme service levels for traffic classes; and isolate 
bandwidth resources associated with certain traffic classes. Inbound as well as outbound 
traffic may be managed. Table 1 provides a definitional list of terminology used herein. 
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LIST OF DEFINITIONAL TERMS 

ADMISSIONS CONTROL A policy invoked whenever a system according to the 

invention detects that a guaranteed information rate cannot 
be maintained. An admissions control policy is analogous to 
a busy signal in the telephone world. 

CLASS SEARCH ORDER A search method based upon traversal of a N-ary tree data 

structure containing classes. 

COMMITTED INFORMATION 
RATE 

(CIR) A rate of data flow allocated to reserved service traffic for 

rate based bandwidth allocation for a committed bandwidth. 
Also called a guaranteed information rate (GIR). 

EXCEPTION A class of traffic provided by the user which supersedes an 

automatically determined classification order. 

EXCESS INFORMATION 
RATE 

(EIR) A rate of data flow allocated to reserved service traffic for 

rate based bandwidth allocation for uncommitted bandwidth 
resources. 

FLOW A flow is a single instance of a traffic class. For example, 

all packets in a TCP connection belong to the same flow. 
As do all packets in a UDP session. 



GUARANTEED 
INFORMATION RATE 
(GIR) 



A rate of data flow allocated to reserved service traffic for 
rate based bandwidth allocation for a committed bandwidth. 
Also called a conmiitted mformation rate (CIR). 
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Hard isolation results from the creation of an entirely 
separate logical channel for a designated set of classes. 



INSIDE 



On the system side of an access link. Outside clients and 
servers are on the other side of the access link. 



ISOLATION 



Isolation is the degree that bandwidth resources are allocable 
to traffic classes. 
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OUTSIDE 



On the opposite side of an access link as viewed from the 
perspective of the system on which the software resides. 



PARTITION 



Partition is an arbitrary unit of network resources. 



15 



POUCY 



A rule for the assignment of a service level to a flow. 



POLICY INHERITANCE A method for assigmng policies to flows for which no 

policy exists in a hierarchical arrangement of policies. For 
example, if a flow is determined to be comprised of FTP 
20 packets for Host A, and no corresponding policy exists, a 

policy associated with a parent node, such as an FTP policy, 
may be located and used. See also POLICY SEARCH 
ORDER. 



25 POUCY BASED 

SCALING 



An adjustment of a requested data rate for a particular flow 
based upon the policy associated with the flow and 
information about the flow's potential rate. 



30 



RESERVED SERVICE 



Reserved service is a service level intended for traffic which 
"bursts" or sends chunks of data. Reserved service is 
defined in terms of a scaled rate. 
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SCALED RATE 



Assignment of a data rate based upon detected speed. 



SERVICE LEVEL 



A service paradigm having a combination of characteristics 
defined by a network manager to handle a particular class of 
traffic. Service levels may be designated as either reserved 
or unreserved. 



10 



15 



SOFT ISOLATION 
TARGET RATE 



Restricting GIR allocated for traffic classes in a partition. 

A target rate is a combination of a guaranteed rate and an 
excess rate. Target rate is a policy-based paradigm. Excess 
rate is allocated by systems according to the invention from 
bandwidth that is not consumed by reserved service. 
Policies will demand excess rate at a given priority and 
systems according to the invention satisfy this demand by a 
priority level. 



20 



TRAFFIC CLASS 



All traffic between a client and a server endpoints. A single 
instance of a traffic class is called a flow. Traffic classes 
have properties or class attributes such as» directionality, 
which is the property of traffic to be flowing inbound or 
outbound. 



25 



UNRESERVED SERVICE Unreserved service is a service level defined in terms of 

priority in which no reservation of bandwidth is made. 



Table 1 

1-1 Hardware Overview 

The method for flow bandwidth management in a packet oriented 
30 telecommimications network environment of the present invention is implemented in the 

C programming language and is operational on a computer system such as shown in Fig. 
lA. This invention may be implemented in a client-server environment* but a client- 
server environment is not essential. This figure shows a conventional client-server 
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computer system which includes a server 20 and numerous clients, one of which is 
shown as client 25. The use of tlie term "server" is used in the context of the invention, 
wherein the server receives queries from (typically remote) clients, does substantially all 
the processing necessary to formulate responses to the queries, and provides these 
responses to the clients. However, server 20 may itself act in the capacity of a client 
when it accesses remote databases located at another node acting as a database server. 

The hardware configurations are in general standard and will be described 
only briefly. In accordance with known practice, server 20 includes one or more 
processors 30 which communicate with a number of peripheral devices via a bus 
subsystem 32. These peripheral devices typically include a storage subsystem 35, 
comprised of a memory subsystem 35a and a file storage subsystem 3Sb holding 
computer programs (e.g., code or instructions) and data, a set of user interface input and 
output devices 37, and an interface to outside networks, which may employ Ethernet, 
Token Ring, ATM, IEEE 802.3, ITU X.25, Serial Link Internet Protocol (SLIP) or the 
public switched telephone network. This interface is shown schematically as a **Network 
Interface" block 40. It is coupled to corresponding interface devices in client computers 
via a network connection 45. 

Client 25 has the same general configuration, although typically with less 
storage and processing capability. Thus, while the client computer could be a terminal 
or a low-end personal computer, the server computer is generally a high-end workstation 
or mainframe, such as a SUN SPARC server. Corresponding elements and subsystems 
in the client computer are shown with corresponding, but primed, reference nimierals. 

Bus subsystem 32 is shown schematically as a single bus, but a typical 
system has a number of buses such as a local bus and one or more expansion buses (e.g., 
ADB, SCSI, ISA, EISA, MCA, NuBus, or PCI), as well as serial and parallel ports. 
Network connections are usually established through a device such as a network adapter 
on one of these expansion buses or a modem on a serial port. The client computer may 
be a desktop system or a portable system. 

The user interacts with the system using interface devices 37' (or devices 
37 in a standalone system). For example, client queries are entered via a keyboard, 
conununicated to client processor 30*, and thence to modem or network interface 40* 
over bus subsystem 32*. The query is then communicated to server 20 via network 
connection 45. Similarly, results of the query are communicated from the server to the 



wo 98/26510 PCT/US97/22550 

11 

client via network connection 45 for output on one of devices 37* (say a display or a 
printer), or may be stored on storage subsystem 35'. 

Fig. IB is a functional diagram of a computer system such as that of Fig. 
lA. Fig. IB depicts a server 20, and a representative client 25 of a plurality of clients 
which may interact with the server 20 via the Internet 45 or any other communications 
method. Blocks to the right of the server are indicative of the processing steps and 
functions which occm: in the server's program and data storage indicated by blocks 35a 
and 35b in Fig. lA. A TCP/IP "stack" 44 works in conjunction with Operating System 
42 to communicate with processes over a network or serial connection attaching Server 
20 to Internet 45. Web server software 46 executes concurrently and cooperatively with 
other processes in server 20 to make data objects 50 and 51 available to requesting 
clients. A Common Gateway Interface (CGI) script 55 enables information from user 
clients to be acted upon by web server 46, or other processes within server 20. 
Responses to client queries may be returned to the clients in the form of a Hypertext 
Markup Language (HTML) docimient ou^uts which are then communicated via Internet 
45 back to the user. 

Client 25 in Fig. IB possesses software implementing functional processes 
operatively disposed in its program and data storage as indicated by block 35a' in Fig. 
lA. TCP/IP stack 44\ works in conjunction with Operating System 42* to communicate 
with processes over a network or serial connection attaching Client 25 to Internet 45. 
Software implementing the function of a web browser 46' executes concurrently and 
cooperatively with other processes in client 25 to make requests of server 20 for data 
objects 50 and 51. The user of the client may interact via the web browser 46* to make 
such queries of the server 20 via Internet 45 and to view responses from the server 20 
via Internet 45 on the web browser 46'. 

1.2 Network Overview 

Fig. IC is illustrative of the internetworking of a plurality of clients such 
as client 25 of Figs. 1 A and IB and a plurali^ of servers such as server 20 of Figs. lA 
and IB as described herein above. In Fig. IC, network 70 is an example of a Token 
Ring or frame oriented network. Network 70 links host 71, such as an IBM RS6000 
RISC workstation, which may be ranning the AIX operating system, to host 72, which is 
a personal computer, which may be running Windows 95, IBM OS/2 or a DOS operating 
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system, and host 73, which may be an IBM AS/400 computer, which may be running the 
OS/400 operating system. Network 70 is intemetworked to network 60 via a system 
gateway which is depicted here as router 75, but which may also be a gateway having a 
firewall or a network bridge. Network 60 is an example of an Ethernet network that 
interconnects host 61, which is a SPARC workstation, which may be running SUNOS 
operating system with host 62, which may be a Digital Equipment VAX6000 computer 
which may be running the VMS operating system. 

Router 75 is a network access point (NAP) of network 70 and network 60. 
Router 75 employs a Token Ring adapter and Efliemet adapter. This enables router 75 
to interface with tlie two heterogeneous networks. Router 75 is also aware of the Inter- 
network Protocols, such as ICMP ARP and RIP, which are described herein below. 

Fig. ID is illustrative of the constituents of the Transmission Control 
Protocol/Internet Protocol (TCP/IP) protocol suite. The base layer of the TCP/IP 
protocol suite is the physical layer 80, which defines the mechanical, electrical, 
functional and procedural standards for the physical transmission of data over 
commimications media, such as, for example, the network connection 45 of Fig. lA. 
The physical layer may comprise electrical, medianical or functional standards siich as 
whether a network is packet switching or frame-switching; or whether a network is based 
on a Carrier Sense Multiple Access/Collision Detection (CSMA/CD) or a frame relay 
paradigm. 

Overlying the physical layer is the data link layer 82. The data link layer 
provides the function and protocols to transfer data between network resources and to 
detect errors that may occur at the physical layer. Operating modes at the datalink layer 
comprise such standardized network topologies as IEEE 802.3 Ethernet, IEEE 802.5 
Token Ring, ITU X.25, or serial (SUP) protocols. 

Network layer protocols 84 overlsr^ the datalink layer and provide the 
means for establishing connections between networks. The standards of network layer 
protocols provide operational control procedures for intemetworidng communications and 
routing information through multiple heterogenous networks. Examples of network layer 
protocols are the Internet Protocol (IP) and the Internet Control Message Protocol 
(ICMP). The Address Resolution Protocol (ARP) is used to correlate an Internet address 
and a Media Access Address (MAC) for a particular host. The Routing Information 
Protocol (RIP) is a dynamic routing protocol for passing routing information between 
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hosts on networks. The Internet Control Message Protocol (ICMP) is an internal 
protocol for passing control messages between hosts on various networks. ICMP 
messages provide feedback about events in the network environment or can help 
determine if a path exists to a particular host in the network environment. The latter is 
called a "Ping", The Internet Protocol (IP) provides the basic mechanism for routing 
packets of information in the Internet. IP is a non-reliable conmmnication protocol. It 
provides a "best efforts" delivery service and does not commit network resources to a 
particular transaction, nor does it perform retransmissions or give acknowledgments. 

The transport layer protocols 86 provide end-to-end transport services 
across multiple heterogenous networks. The User Datagram Protocol (UDP) provides a 
connectionless, datagram oriented service which provides a non-reliable delivery 
mechanism for streams of information. The Transmission Control Protocol (TCP) 
provides a reliable session-based service for delivery of sequenced packets of information 
across tlie Internet. TCP provides a connection oriented reliable mechanism for 
information delivery. 

The session, or application layer 88 provides a list of network applications 
and utilities, a few of which are illustrated here. For example. File Transfer Protocol 
(FTP) is a standard TCP/IP protocol for transferring files from one machine to another. 
FTP clients establish sessions through TCP connections with FTP servers in order to 
obtain files. Telnet is a standard TCP/IP protocol for remote terminal connection. A 
Telnet client acts as a terminal emulator and establishes a connection using TCP as the 
transport mechanism with a Telnet server. The Simple Network Management Protocol 
(SNMP) is a standard for managing TCP/IP networks. SNMP tasks, called "agents", 
monitor network status parameters and transmit these status parameters to SNMP tasks 
called "managers. " Managers track the status of associated networks. A Remote 
Procedure Call (RFC) is a programming interface which enables programs to invoke 
remote functions on server machines. The Hypertext Transfer Protocol (HTTP) 
facilitates the transfer of data objects across networks via a system of uniform resource 
indicators (URI). 

The Hypertext Transfer Protocol is a simple protocol built on top of 
Transmission Control Protocol (TCP). It is the mechanism which underlies the function 
of the World Wide Web. The HTTP provides a method for users to obtain data objects 
from various hosts acting as servers on the Internet. User requests for data objects are 
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made by means of an HTTP request, such as a GET request. A GET request as depicted 
below is comprised of 1) an HTTP protocol version* such as "http:/1.0"; followed by 2) 
the fiiU path of the data object; followed by 3) the name of the data object. In the GET 
request shown below, a request is being made for the data object with a path name of 
"/pub/" and a name of "MyData.html": 

HTTP-Version GET /pub/MyData.html (1) 

Processing of a GET request entails the establishing of an TCP/IP 
connection with the server named in the GET request and receipt from the server of the 
data object specified. After receiving and interpreting a request message, a server 
responds in the form of an HTTP RESPONSE message. 

Response messages begin with a status line comprising a protocol version 
followed by a numeric Status Code and an associated textual Reason Phrase. These 
elements are separated by space characters. The format of a status line is depicted in 
line (2): 

Statos-Line = HTTP- Version Status-Code Reason-Phrase (2) 

The status line always begins with a protocol version and status code, e.g., "HTTP/1.0 
200 The status code element is a three digit integer result code of the attempt to 
understand and satisfy a prior request message. The reason phrase is intended to give a 
short textual description of the status code. 

The first digit of the status code defines the class of response. There are five 
categories for the first digit. IXX is an mfonnation response. It is not currently used. 
2XX is a successful response, indicating that the action was successfiilly received, 
understood and accepted. 3XX is a redirection response, indicating that further action 
must be taken in order to complete the request 4XX is a client error response. This 
indicates a bad syntax in the request. Finally, 5XX is a server error. This indicates that 
the server failed to fulfill an apparently valid request. 

Particular formats of HTTP messages are described in, Crocker, D., 
"Standard for the Format of ARPA Internet Text Messages", STD 11, RFC 822, UDEL. 
August 1982. 
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1.3 TCP Data Link Rate and Latency Period Determination 

Techniques for determining data rates are more ftiUy set forth in co-owned 
U,S. Patent Application Serial No. 08/762,828, entitled "Method for Rapid Data Rate 
Detection in a Packet Communication Environment Without Data Rate Supervision". 

Fig. IE depicts a two-dimensional timing diagram illustrative of one 
particular method for determining a data link rate and latency period from an exchange 
of packets between TCP endpoints. According to this method, the initial data packets are 
examined as they establish a connection. Parameters are developed from which round 
trip time (RTT) and maximum data rate can be determined. The serialization speed (SS) 
or data flow rate capacity of a link is given by the relation: 

SS=m/Ti (3) 

where: 

m = number of bytes in the first data packet (Data!) 

Ti = The arrival time of the first data packet less the time of arrival of the ACK 
packet (T^i -T^ck) 

Fig. IE shows a two dimensional timing diagram of the connection 
protocol of a remote HTTP request between a local TCP endpoint (server) and a remote 
TCP endpoint (client). 'The remote endpoint issues a request for connection in the form 
of a SYN packet in step A. The SYN packet takes a finite but unknown transit time to 
arrive at the local TCP endpoint. The local TCP endpoint responds by sending its own 
SYN packet in step B. This SYN packet is of a known byte length and is issued at a 
known time, which becomes the reference time, "tbase." After a brief latency, the 
remote TCP endpoint issues a standard ACK packet, whose length is likewise known, in 
step C, and then also issues the first data packet datal in step D. 

Time Tj is computed immediately at the time of arrival of datal by 
examining the difference in arrival time of the received ACK packet and the datal 
packet. The value of m is extracted by examining the entire first packet to determine its 
length or by examining the packet length field, if any, in the header information of the 
packet. A first estimate of serialization speed SS is given by Equation (3). Serialization 
speed can be estimated immediately upon interchange of the first packets and used to 
make important strategic decisions about the nature and speed of the connection about to 
be established. 
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An alternative method for determining a data link rate and latency period 
from an exchange of packets between TCP endpoints develops parameters from which 
round trip time (RTT) and maximum data rate can be determined. Values are obtained 
for serialization of n, the size (i.e., data length) of the SYN packet in response, plus the 
size (i.e., data length) of the ACK packet. Serialization time is determined according to 
equation (4) by dividing the value n by the transit time Tq, or 

SS(max possible) = n/To (4) 

where 

n = number of bytes in the SYN packet plus the number of bytes in the ACK 
packet and 

To = the arrival time of the ACK packet less the tbase value. 

The round trip time minus the serialization time is the round trip latency 

2.0 Traffic Class 

A traffic class is broadly defined as traffic between one or more clients 
and one or more servers. A single instance of a traffic class is called a fiow. Traffic 
classes have the property, or class attribute, of being directional, i.e. all traffic flowing 
inbound will belong to different traffic classes and be managed separately from traffic 
flowing outbound. The directional property enables asymmetric classification and control 
of traffic, i.e., inbound and outboimd flows belong to different classes which may be 
managed independent of one another. 

Traffic classes may be defined at any level of the TCP/IP protocol. For 
example, at the IP level, traffic may be defined as only those flows between a set of 
inside and outside IP addresses or domain names. An exanq>le of such a low level traffic 
class definition would be all traffic between my netwoiic and other corporate offices 
throughout the Internet. At the application level, traffic classes may be defined for 
specific URIs within a web server. Traffic classes may be defined having "Web aware" 
class attributes. For example, a traffic class could be created such as all URIs matching 
"*.html" for all servers, or all URIs matching "*.gir for server X, or for access to 
server Y with URI Vsales/?^" from client Z, wherein **' is a wildcard character, i.e., a 
character which matches all other character combinations. Traffic class attributes left 
unspecified will simply match any value for that attribute. For example^ a traffic class 
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that accesses data objects within a certain directoxy path of a web server is specified by a 
URI of the directory path to be managed, e.g. "/sales/*". 

2-1 Classifying Traffic 

The present invention provides a method for classifying traffic according 
to a definable set of classification attributes selectable by the manager, including 
selecting a subset of traffic of interest to be classified. The invention provides the ability 
to classify and search traffic based upon multiple orthogonal classification attributes. 
Traffic class membership may be hierarchical. Thus, a flow may be classified by a 
series of steps through a traffic class tree, with the last step (i.e., at the leaves on the 
classification tree) mapping the flow to a policy. The policy is a rule of assignment for 
flows. For example, the first step in classification may be to classify a flow as web 
traffic, the next may further classify this flow as belonging to server X, and the final 
classification may be a policy for URI "*.avi". 

A classification tree is a data structure representing the hierarchical aspect 
of traffic class relationships. Each node of the classification tree represents a class, and 
has a traffic specification, i.e., a set of attributes or characteristics describing the traffic, 
and a mask associated with it. Leaf nodes of the classification tree contain policies. 
According to a particular embodiment, the classification process checks at each level if 
the flow bemg classified matches the attributes of a given traffic class. If it does, 
processing continues down to the links associated with that node m the tree. If it does 
not. the class at the level that matches determines the policy for the flow being classified. 
If no policy specific match is found, the flow is assigned the default policy. 

In a preferable embodiment, the classification tree is an N-aiy tree with its 
nodes ordered by specificity. For example, in classifying a particular flow in a 
classification tree ordered first by organizational departments, the attributes of the flow 
are compared with the traffic specification in each successive department node and if no 
match is found, then processing proceeds to the next subsequent departm^ node. If no 
match is found, then the final compare is a default "match all** category. If, however, a 
match is found, then classification moves to the children of this departmoit node. The 
child nodes may be ordered by an orthogonal paradigm such as, for example, "service 
type." Matching proceeds according to the order of specificity in the child nodes. 
Processing proceeds in this manner, traversing downward and frx)m left to right in the 
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classification tree, searching the plurality of orthogonal paradigms. Key to implementing 
this a hierarchy is that the nodes are arranged in decreasing order of specificity. This 
permits search to find the most specific class for the traffic before more general. 

Table 2 depicts components from which Traffic classes may be built. 
Note that the orientation of the server (inside or outside) is specified. And as noted 
above, any traffic class component may be unspecified, i.e. set to match any value. 
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Traffic Class Components 



Client Side 



Server Side 
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IP Address/Domain Name IP Address/Domain Name 



TCP or UDP Service, e.g. WWW, FTP. RealAudio, 
etc. 

URI for Web Service, e.g. "*.html", "*.gir, Vsales/*", 
etc. 



20 Table 2 

Figs. 2 A and 2B depict representative allocations of bandwidth made by a 
hypothetical network manager as an example. In Fig. 2A, the network manager has 
decided to divide her network resources first by allocating bandwidth between 

25 Departments A and B. Fig 2 A shows the resulting classification tree, in which 

Department A bandwidth resources 202 and Department B bandwidth resources 204 each 
have their own nodes representing a specific traffic class for that department. Each 
traffic class may have a policy attribute associated with it. For example, in Fig. 2A, the 
Department A resources node 202 has the policy attribute Inside Host Subnet A 

30 associated with it. Next, the network manager has chosen to divide the bandwidth 

resoxirces of Department A among two applications. She allocates an FTP traffic class 
206 and a World Wide Web server traffic class 208. Each of these nodes may have a 
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separate policy attribute associated with them. For example, in Fig. 2A, the FTP node 
206 for has an attribute Outside port 20 associated with it. Similarly, the network 
manager has chosen to divide network bandwidth resources of Department B into an FTP 
server traffic class 210 and a World Wide Web server traffic class 212. Each may have 
their own respective policies. 

Fig. 2B shows a second example, wherein the network manager has chosen to 
first divide network bandwidth resource between web traffic and TCP traffic. She 
creates three traffic nodes, a web traffic node 220. a TCP traffic node 224 and a default 
node 225. Next, she divides the web traffic among two organizational departments by 
creating a Department A node 226, and a Department B node 228. Each may have its 
own associated policy. Similarly, she divides TCP network bandwidth into separate 
traffic classes by creating a Department A node 230 and a Department B node 232. 
Each represents a separate traffic class which may have its own policy. 

All traffic which does not match any user specified traffic class falls into an 
automatically created default traffic class which has a default policy. In a preferable 
embodiment, the default policy treats default class flows as unreserved traffic at the 
default (medium) priority. In Fig 2A, the default category is depicted by a default node 
205, and in Fig. 2B, the default category is depicted by a default node 225. 

2.2 Creating Policies 

2.2.1 Reserved vs. Unreserved 

Network managers create policies in order to assign service levels to traffic 
classes which are of particular interest. Service levels may be either reserved or 
unreserved. Reserved service is useful in instances where traffic 'bursts', i.e., sends 
chunks of data, or in interactive applications, which require a minimum bandwidth in 
order to function properly. 

Reserved service is defined in terms of scaled rate, unreserved service is defined 
in terms of priority. Allocation of bandwidth resources by scaled rate is acconq>lished in 
a process known as "speed scaling" wherein the speed of the underlying network link 
upon which the flow is carried is detected and available bandwidth is allocated based 
upon this detected speed. Allocation of bandwidth resources for unreserved service is 
priority based. The total bandwidth resources for unreserved service demanded by all 
flows is satisfied according to a prioritized scheme, in which highest priorities are 



wo 98/26510 PCTAJS97/22550 

20 

satisfied first, followed by successively lower priority applications. Unreserved service 
allocations are based upon the available bandwidth after reserved service demands have 
been satisfied. 

Traffic may be a combination of reserved and unreserved services. For 
example, web traffic may have a service level defined at a web server for "*.html" to 
designate these types of files as traffic allocated unreserved service, as these are typically 
small files. In the same web server, "*.gir files may be designated as a separate traffic 
class receiving reserved service. TCP connections and UDP sessions may be defined for 
reserved service. 

2.2.2 Reserved Service: GIR vs. EIR 

Reserved service may have both a guaranteed information rate (GIR) and an 
excess information rate (EIR) components. Guaranteed information rate represents a 
commitment of bandwidth resources to maintain a specified rate. Excess information 
rate is an allocation of bandwidth resources on an "as available" basis. Guaranteed 
information rate prevents evident lack of progress situations, such as stuck progress 
indicators, from occurring. Excess rate is allocated firom available bandwidth, i.e. that 
which has not consumed by providing guaranteed rate service. As flows demand excess 
information rate at different priority levels, their demands are satisfied in order of 
priority level. For example, in an application having a ten voice over IP sessions, each 
session requiring a minimum of eight kilobits of bandwidth rate which must be 
guaranteed and up to 8 kilobits of additional bandwidth rate which may be used from 
time-to-time, would be allocated reserved service comprised of eight kilobits of 
guaranteed information rate for each of the ten sessions. Then, from the remaining 
available bandwidth resources, eight kilobits of excess inforaoation rate is allocated to 
each session based upon the priority of each session, until either all demands have been 
satisfied, or bandwidth resources are completely consumed. 

When a guaranteed infomtiation rate cannot be provided to a requesting flow due 
to insufficient bandwidth resources, an admissions policy is invoked. The admissions 
policy may refuse a connection, display an error message or a combination of both. 
Admissions policies prevent service 'brown outs* which would otherwise occur if 
guaranteed information rate demand exceeds available bandwidth resources. By contrast, 
excess information rate requirements are not guaranteed. If a flow's request for excess 
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infoimauon rate is not able to be satisfied from the available bandwidth resources, it is 
simply not met. In the previous example, if insufficient bandwidth existed in order to 
acconmiodate the ten voice over IP sessions, an admissions policy would be invoked. 

Excess information rate (EIR) is redistributed each time a new flow is created, 
an existing flow is terminated, or a substantial change in demand for unreserved priority 
level occurs. By contrast. Guaranteed information rate (GIR) is allocated only at the 
time of creation of a flow and freed when the flow terminates. 

In a preferable embodiment, reserved TCP service levels may be enforced by 
TCP Rate Control for both inbound and outbound traffic. TCP Rate Control is a 
technique well known to persons of ordinary skill in the art which scales reserved flows 
as excess rate is available without incurring retransmission. 

2.3 Isolating Bandwidth Resources for Traffic Classes 

Bandwidth resources may be dedicated to certain traffic classes. There are two 
degrees of isolation. Hard isolation creates an entirely separate logical channel for 
designated traffic classes. For example, department A and department B could both be 
hard isolated, setting up two logical .75 Mbps channels on a Tl access link, one for each 
department. With hard isolation, unused bandwidth in either channel would not be 
shared with the other department's traffic. 

Hard isolation involves allocating arbitrary subsets of access link bandwidth for 
any traffic class or family of traffic classes. It completely protects the traffic belonging 
to the respective class or classes, but this protection comes at the cost of forsaken 
opportunities to share unused bandwidth. 

Soft isolation allows sharing of unused bandwidth, but limits the amount of 
guaranteed bandwidth that can be allocated to any particular traffic class at any one time. 
This provides a basic level of fairness between traffic classes which have corresponding 
guaranteed service policies. But this fairness does not come at the cost of efficiency 
gains accrued from sharing available bandwidth. 

3.0 System Function 

3.1 Building a Traffic Classification Tree 

Fig. 2C depicts a flowchart 203 showing the process steps in building a 
classification tree, such as classification tree 201 of Fig. 2A, by adding traffic class 
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nodes, such as traffic class node 206, in order of specificity of classes. In a step 240, a 
new tclass node is allocated for the traffic class which is to be added to the classification 
tree. Next, in a step 242, a traffic specification is inserted for the new traffic class node, 
allocated in step 240. Tlien, in a step 244, the new tclass node is inserted into the 
classification tree. Next, in a step 246, bandwidth resources allocated based upon the 
traffic class hierarchy is redistributed to reflect the new hierarchy. 

Fig. 2D depicts a flowchart 205 showing the component steps of traffic 
specification creation step 242 of Fig. 2C. The processing steps of flowchart 205 insert 
a 

traffic specification, or tspec, into the traffic class allocated in step 240. In a step 250, a 
current tspec is initialized to the first tspec in the tclass. Next, in a decisional step 252, 
a determination is made whether the new tspec is less specific than the current tspec, 
initialized in step 250. If this is not the case, then in a step 258, the new tspec is 
inserted after the current tspec, and processing returns. Otherwise, in a decisional step 
254, a determination is made whether this is the final tspec in the traffic class. If this is 
so, then in step 258, the new tspec is inserted after the last (current) tspec. Otherwise, 
in a step 256, the next tspec in the traffic class is examined by repeating processing steps 
252. 254, 256 and 258. 

Fig. 2E depicts a flowchart 207 showing the component steps of traffic 
specification creation step 244 of Fig. 2D. The processing steps of flowchart 207 insert 
the traffic class, or tclass, allocated in step 240 into the classification tree. In a step 260, 
a current sibling is initialized to the first sibling in the parent tclass. Next, in a 
decisional step 262, a determination is made whether the new tclass is less specific than 
the current sibling, initialized in step 260. If this is not the case, then in a step 268, the 
new tclass is inserted after the current sibling, and processing returns. Otherwise, in a 
decisional step 264, a determination is made whether this is the final sibling in the parent 
tclass. If this is so, then in step 268, the new tclass is inserted after the last (current) 
tclass. Otherwise, in a step 266, the next sibling in the parent traffic class is examined 
by repeating processing steps 262, 264, 266 and 268. 

3.2 Allocating Bandwidth ba sed upon Policy 

Fig. 3 depicts a functional block diagram 301 of component processes in 
accordance with a particular embodiment of the present invention. Processing of a new 
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flow, such as new flow 300 as described by a flowchart 501 in Fig. 5A, begins with a 
TCP autobaud component 302, which implements the automatic detection of flow speeds 
as depicted in Fig. IE. TCP autobaud component 302 determines values for selectable 
information such as flow data rate of new flow 300. A classifier 304 determines the 
class for the new flow using the information determined by TCP autobaud 302. Having 
classifled the trafflc, the classifler next determines an appropriate policy for this 
particular flow from a classification tree (not shown). Based upon the policy, the 
classifler then determines a service level for the new flow. A bandwidth manager 306 
uses the policy determined by classifier 304 in order to allocate bandwidth according to 
the service level prescribed by the policy. 

Fig. 4A depicts data structures in a particular embodiment according to the 
invention. The bandwidth management aspect of the present invention is performed 
between inside clients and servers and the access link. Outside clients and servers are on 
the other side of the access link. In the embodiment of Fig. 4A, Traffic is directional. 
Inbound (flowing from an outside host to an inside host) traffic is managed entirely 
separately from outbound traffic. 

A connection hash table 402 facilitates indexing into a pliurality of TCP 
connections, each TCP connection having an associated flow, such as flows 300, 299 and 
298 of Fig. 3. A plurality of TCP control blocks 404 are used to track connection- 
related information for each TCP connection. Host-related information is tracked using a 
plurality of host information control blocks 406, indexed by a host hash table 408. 
Information about particular hosts corresponding to TCP connections is tracked using the 
TCP control block 404. 

Each TCP connection represented by TCP control block 404 has associated with 
it a gear 410 representing an outboimd flow and a gear 411 representing an inboimd 
flow. Each gear is an internal data object associated with a particular TCP connection or 
UDP session. Gears are protocol independent and contain rate-based information. Since 
network resources are managed separately for inbound and outbound traffic in this 
particular embodiment, there is a bandwidth pool 412 for outbound traffic and a 
bandwidth pool 413 for inbound traffic. Bandwidth pools represent collections of 
available network resources managed by bandwidth manager 306. Bandwidth pools are 
arbitrarily divided into a partition 414 in the outbound path, and a partition 415 in the 
inbound path. Each partition represents an arbitrarily-sized portion of available network 
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resource. Gears become associated with particular partitions during the beginning of 
flow management for the new flow in step 518 of Fig. 5 A. A policy 416 in the 
outbound path and a policy 417 in the inbound path become associated with partition 414 
in the outbound path and partition 415 in the inboimd path during the beginning of 
management of a new flow, as shown in Fig. 5A. These policies are associated with 
particular flows according to the trafflc class that classifier 304 assigns to the particular 
flow. 

Fig. 5A depicts flowchart 501 of the steps taken in initiating a new flow in 
accordance with a particular embodiment of the invention. In a step 502, TCP autobaud 
component 302 detects a TCP data rate of new flow 300. In a step 504, classifier 304 
classifies the new flow to determine a policy. Classiflcation is described in greater detail 
in Fig. 5F herein below. In a step 506, based upon the policy determined in step 504, a 
guaranteed information rate (GIR) and an excess information rate (EIR) are determined 
and speed scaled to the incoming TCP data rate, detected in step 502. Speed scaling is 
described in greater detail in Figs. 5G and 5H herein below. In a step 508, bandwidth 
manager 306 allocates guaranteed information rate resources and excess information rate 
resources to accommodate an initial target rate for the new flow. In a decisional step 
510, bandwidth manager 306 determines if a partition limit would be exceeded by the 
allocation determined in step 508. If this is the case, then in a step 512, an admissions 
policy is invoked. Otherwise, in a step 514, the excess information rate resources 
demanded by new flow 300 is added to a total excess information rate resources 
demanded for this bandwidth pool. In a step 516, excess information rate resources are 
redistributed with respect to all flows within the partition, if demand has changed 
substantially since the last time it was redistributed. Fig. 5E depicts a flowchart 509 
showing the steps in reapplying demand satisfaction parameters to all traffic flows in the 
bandwidth pool. A new target rate and total excess information rate resources are 
calculated. Finally, in a step 518, the new flow is associated with a partition. 

Fig. 5B depicts a flowchart 503 showing processing of a packet belonging to an 
existing flow such as a forward flow packet 299. In a decisional step 520, a scheduler 
308, using policy information determined at flow initiation, filters TCP acknowledgments 
for scheduling in a step 522. Output times are scheduled on millisecond boundaries, 
based upon the target data rate computed in step 508 of Fig. 5A. In as step 524, the 
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remaining packets are delivered immediately to TCP, which perfonns just in time 
delivery. 

Fig. SC depicts a flowchart 505 showing processing by bandwidth manager 306 
of a packet at the termination of a flow, such as an end flow packet 298, In a step 530, 
guaranteed information rate resources are restored. In step 532, excess information rate 
resources demanded by the terminating flow 298 is subtracted from the total excess 
information rate resources demanded. In a step 534, excess information rate resources 
are redistributed as described in Fig. 5D. Finally, in a step 536, the flow is removed 
from its partition. 

Fig. 4B depicts specific data structures used in redistributing unreserved service 
resources among multiple flows in a particular partition. Ii^ut rates of particular flows, 
such as new flow 300, are measured by TCP autobaud component 302 and used to 
determine an Extrapolated Unreserved Demand or (EUD) 424. Flows requiring excess 
information rate resources, such as flows 299 and 298, having gears 410 and 411 (not 
shown), respectively, have associated with them an individual flow demands 420 and 
421. The individual flow demands of all flows in a particular partition are added 
together to form an Aggregate Rate Demand (ARD) 422 at each priority level. 
Aggregate rate demand is added to extrapolated unreserved demand 424 to arrive at a 
total instantaneous demand 426. The total instantaneous demand is satisfied from the 
highest priority, priority 7, downward based upon the available excess information rate 
resources in the partition 414. This available rate is information rate which has not been 
used to satisfy guaranteed information rate demands. From the available excess 
information rate resources, a satisfaction percentage vector 428 is computed having the 
percentage of each priority of flow which may be satisfied in the total demand vector 426 
by the available excess information rate resources in partition 414. Individual flow 
demand vectors 420 and 421 are each multiplied by the satisfaction percentage for each 
priority 428 in order to arrive at a target rate which is incorporated in gears 410 and 411 
corresponding to each flow. The satisfaction percentage vector 428 is also applied to the 
Extrapolated Unreserved Demand (EUD) 424 to determine unreserved targets. 

3.3 Reconciling Needs for Bandwidth 

Fig. 5D depicts flowchart 507 of the processing steps to redistribute excess 
information rate resources with respect to reserved service flows within a given partition. 
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In a step 540, a demand satisfaction metric is computed for the bandwidth pool being 
redistributed. In a decisional step 542, a determination whether the demand satisfaction 
metric computed in step 540 has changed substantially since the last time demand was 
redistributed. If this is the case, then in a decisional step 544, a loop is entered for each 
gear associated with the partition owning the bandwidth pool being redistributed. In a 
step 546, demand satisfaction parameters are res^plied a traffic flow associated with each 
gear and a new target rate and a new total excess information rate resources are 
calculated. Processing continues until in decisional step 544, it is determined that all 
gears have been processed. 

Fig. 5E depicts flowchart 509 of the processing steps for reconciling bandwidth 
among a plurality of flows, requiring reserved service and unreserved service resources. 
For example, partition 414 of Fig. 4B has unreserved service flow 300 and reserved 
service flows 299 and 298. In a decisional step 550, a loop is formed with a step 554, in 
which the individual flow demands of reserved service flows, such as flows 299 and 298 
in partition 414, are added together to form an aggregate rate demand (ARD) 422. Next, 
in a step 552, an extrapolated unreserved demand (EUD) 424 is determined from the 
iiq>ut rate of unreserved service traffic 300 in partition 414. In a preferable embodiment, 
the EUD must be inflated in order to promote growth in traffic. This is done to 
acconmiodate the elastic nature of TCP traffic. Next, in a step 556, the EUD determined 
in step 552 and the ARD determined in step 554 are combined to form an instantaneous 
total demand (TD) 426 for the entire partition 414. The total instantaneous demand is 
satisfied from the highest priority level, level 7, downward based upon the available 
unreserved resources in the partition 414. In a step 558, a satisfaction percentage vector 
(SP) 428 is computed for the partition from the total demand vector (TD) 426 determined 
in step 556. The satisfaction percentage indicates that percentage of the demand at each 
priority level that can be satisfied from the unreserved resources in the partition. Next, 
in a step 560, a target rate for reserved rate based flows, such as flows 299 and 298, is 
computed. Finally, in a step 562, a target rate for unreserved priority based flows, such 
as flow 300, is computed. 



3.4 Searching a C lassification Tree 

Fig. 5F depicts a flowchart 511 showing the component steps of traffic 
classiflcation step 504 of Fig. 5 A. The processing steps of flowchart 511 determine a 
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class and a policy for a flow, such as new flow 300, by traversing a classification tree 
such as the classification tree 201 in Fig. 2A. Processing begins with a first node in the 
classification tree 201, such as department A node 202. In a step 570, a traffic 
specification of a child node, such as FTP node 206 is compared with a flow 
specification of the new flow 300. In a decisional step 572, if a match is discovered, 
then in a step 574, the processing of flowchart 511 is applied to the child node 206 
recursively. Otherwise, if child node 206 does not match the new flow, then in a 
decisional step 576, processing determines whether any sibling nodes exist for the child 
node 206. If processing detects the existence of a sibling of child node 206, such as 
child node 208, then processing continues on node 208 with step 570. Otherwise, if 
there are no further child nodes for node 202, then in a decisional step 578, a 
determination is made whether the node is a leaf node having a policy. If no policy 
exists, then in a step 580, processing backtracks to a parent node and looks for a policy 
associated with the parent node to apply to the new flow 300. Otherwise, if a policy is 
associated with the node 202, then in a step 582, the policy is associated with the new 
flow 300. 

3.5 Speed Scaling 

Fig. 5G depicts a flowchart 513 showing the component steps of speed scaling 
step 506 of Fig. 5 A. The processing steps of flowchart 513 determine an acceptable 
allocation of bandwidth to reserved service flows, i.e., GIR and EIR, by allocating rate 
for each gear, such as gears 410 and 411, based upon individual flow demands, base 
limits and total limits. In a step 582, scaled rate for GIR is set based upon a connection 
speed detected by TCP autobaud component 302 for a particular flow, such as flow 299, 
having a gear, such as gear 410. Then in a step 584, scaled rate for EIR is set based 
upon a connection speed detected by TCP autobaud component 302 for a particular flow, 
such as flow 298, having a gear, such as gear 411. Next, in a step 586, a limit is 
computed from either the available EIR remaining after the allocation to flow 299 in step 
584, or from a total limit, if such a limit is specified. Then, m a step 588, a 
determination is made whether any extra EIR resource remains after the limit computed 
in step 586 has been satisfied. Then, in a step 590, limits computed in step 586 are 
allocated to flows, such as flow 299 and flow 298, in order of descending priority level. 
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In a decisional step 592, the processing of steps 582 through 592 arc applied to 
subsequent gears, if any. 

Fig. 5H depicts a flowchart 515 showing the component steps of scaled rate 
setting steps 582 and 584 of Fig. 5G. The processing steps of flowchart 515 detetmine 
whether the rate of the underlying link detected by TCP autobaud 302 exceeds limits 
imposed on the network. In a decisional step 592, a determination is made whether the 
detected speed of a flow is less than a base limit minimum. If this is so, then in a step 
594, the base limit is returned as the scaled rate. Otherwise,, in a decisional step 596, a 
determination is made whether the total rate of all flows is greater than a maximum 
limit. If this is so, then in a step 598, the maximum limit is returned as the scaled rate. 
Otherwise, in a decisional step 600, a determination is made whether the detected speed 
of a flow is greater than a total rate of all flows. If this is so, then in a step 602, the 
maximum limit is returned as the scaled rate. Otherwise, in a step 604, a percentage of 
the total rate, represented by the maximimi limit, is computed. Then, in a step 606, the 
percentage computed in step 604 is applied to a total rate available, i.e., the maximum 
limit - base (minimum) limit. The result is retumed as the scaled rate. 

4.0 Conclusion 

In conclusion, the present invention provides for a policy based bandwidth 
allocation method for IP packet telecommunications systems wherein bandwidth is 
allocated to requesting flows according to automatically determined application 
requirements. An advantage of network management techniques according to the present 
invention is that network managers need only define traffic classes which are of interest. 
A further advantage of the present invention is that trafflc classes may include 
information such as a URI for web trafflc. A yet further advantage of the present 
invention is that service levels may be defined in terms of explicit rates, may be scaled to 
a remote client or server's network access rate, specified for high speed and low speed 
users and defined in terms of a guaranteed minimum service level. 

Other embodiments of the present invention and its individual components will 
become readily apparent to those skilled in the art from the foregoing detailed 
description. As will be realized, the invention is capable of other and different 
embodiments, and its several details are capable of modifications in various obvious 
respects, all without departing from the spirit and the scope of the present invention. 
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Accordingly, the drawings and detailed description are to be regarded as illustrative in 
nature and not as restrictive. It is therefore not intended that the invention be limited 
except as indicated by the appended claims. 

A pseudo code description of a particular embodiment is included herein below: 
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* 
« 

* description: BANDWIDTH POOL related functions. 
** 

*/ 

#defme EIR_LEVELS 8 

#defme EIR_HIGHEST 7 

#derine EIR_LOWEST 0 

#define EIR_DEFAULT 3 

#define EIRJSTONE -1 
#define EIR_ALL_LEVELS (Oxfffi) 

/* 

* An EIR vector is a demand for Excess rate bandwidth at each of 

* the N eir prioritylevels. 
*/ 

typedefBPS EIR_VECTOR[EIR_LEVELS]; 
/♦ 

* Rate Specifier, i.e. OR + EIR... 
♦/ 

typedef struct { 

BPS cirLimit; /* How much CIR I want */ 

BPS eirDemand; /* total of eirDemandvector */ 

EIR_VECTOR eirDemandVector; /♦ How much EIR I want */ 
/* 

* UDP stuff tbd... 
*/ 

INT32 delayBound; 
} RATE_SPEC; 

#defme RATE_UNCONSTRAINED OxffflGBEKL 
/* 

* Latency status record 
*/ 

#defme LATENCY BIG TROUGH OxffffiffiUL 



typedef struct { 
INT32 
INT32 



ewmaValuc; 
lastPeak; 
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INT32 lastTrough; 
INT32 peak; 
INT32 trou^ 
TIME timeStamp; 

} LATENCY; 



typedef struct .gearshift { 
DLINK dlink; 

#define GF_INITIALIZED 

#define GF_SCALED 

#define GF_ACTIVE 

#define GF_EIR_OVERALLOCATED 

available*/ 

#defme GF_CIR_OVERALLOCATED 
available*/ 

#define GF_QUIESCENT 
#defme GF_PRIORITY 
#define GF_RATE 
#defme GF_DQ 
#define GFJTCPRC 
#defme GF.FILTCR 
#define GF_NEEDS_REALLOC 
#define GF_CIR.COMMITTED 



0x0001 

0x0002 
0x0004 

0x0008 /* allocated though none 

0x0010 /♦ allocated though none 

0x0020 
0x0040 
0x0080 

0x0100 

0x0200 
0x0400 

0x0800 
0x1000 



void* hosts[2]; /* Who the 

hosts are! */ 

INT16 flags; 
INT16 priCounted; 

BPS target; 
INT32 cirlnUse; 
INT32 eirlnUse; 

BPS lastScaledBps; 



RATE_SPEC rspec; 
} GEARSHIFT; 



/* 

* For scaling CIR to connection speed. See traffic/scalex. 

* If limit is set to 0, no CIR is allocated. 
*/ 

typedef struct { 

BPS base; 

BPS limit; 
} CIR_SPEC; 

/ 

* For scaling EIR to connection speed, i.e. creating eirLimits vector. 

* hp -highest priority, hp-base, and hp-limit and optional total limit 
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* See traffic/scalex. 

* As with CIR Spec, if hpLimit is set to 0, no EIR is allocated. 
♦/ 

typedef struct { 

INT8 hp; 

BPS hpBase; 

BPS hpLimit; 

BPS totalLimit; 
} EIR^SPEC; 



* bwpRedistributeEir(bwp, force) 

^ Called when a FLOW is added to or deleted from a tclass belong to 

* this partition (partiton is a bandwidth pool - BWJPOOL) 

* Reallocate EIR. If new allocation percentages are significantly different 

* for any priority level, then spin through the GEARs to update the 

* SATISFACTION percentage level at each GEAR (at each TCLASS) with a 

* demand at that level. 

* NOTE: Execution efficieny is critical, 
void 

bwpRedistributeEir(BW_POOL *bwp, mt forceAll) 
{ 

whereStr("bwpRedistributeEir"); 
EIR_MASK changeMask - 0; 

EIR_VECTOR eirNewPercentage, urDemand, urAllocated; 
int i, available, urTotal, totalDemand, demand, diS; 
int new, old; 

int dir = bwp->flags & BWF_DIR_MASK; 

VALIDATE_BWP(bwp); 

available = bwp->limit - bwp->cirCurrent; 
totalDemand = eirVectorSum(bwp->eirDemand); 
urTotal = bwpUqDemand(bwp, urDemand); 
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bvvp->stats.eirRedistributions++; 
/* 

* All SATISIFIED ? 
•/ 

if (available > (totalDemand + urTotal)) { 

eirVectorSet(eirNewPercentage, 100); 
bvvp->eirCun'ent = totalDemand; 
/♦ 

* And give leftovers to UQ with a little margin unallocated... 
*/ 

urTotal = max((available - totalDemand) - (bwp->limit » 3), urTotal); 
bwpUqSetTarget(bwp, urTotal); 

} 

/* 

* None SATISIFIED ? 
*/ 

else if (available <= 0) { 
/* 

* No EIR is allocated... 
•/ 

bvvp->eirCurrent = 0; 
eirVectorSet(eirNewPercentage, 0); 
/* 

* Priority buckets set to minimal output rate... 
*/ 

bwpUqSetTarget(&bwp->uq, 0); 

} 

/♦ 

* Some SATISFIED.., 
*/ 

else { 

bwpSetEirPct(bwp, available, urDemand, eirNewPercentage); 
forceAll = TRUE; 

} 

/♦ 

* OK - now check NEW versus OLD parentages to see if we have to 

* run through the gears... 
♦/ 

if (forceAll) 

changeMask = EIR_ALL_LEVELS; 

else { 

for (changeMask = i » 0; i <= EIR_fflGHEST; i-H-) { 
new = eirNe\yPercentage[i]; 
old = bwp->eirPctSatisfied[i]; 

if (new '== old) 
continue; 

else { 

/* 
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* Check to see if the % difference is significant 
*/ 

difiF = abs(new - old); 

if (difif > (old » 3)) changeMask }= (1 « i); 

} 

} 

eirVectorCopy(eirNewPercentage, bwp->eu'PctSatisfied); 
^if (changeMask) bwpUpdateGearEu^bwp, changeMask); 



* 



♦ bvvpSetEirPct(bwp, avail, urTotal, urDemand, newPctV) 

♦ Allocate available bandiwdth between UQ and EIR. Note only one 

* priority level is PARTIALLY SATISIFIED. 

*♦ 

V 

void 

bwpSetEirPct(BW_POOL ♦bwp, 

int avail, 

EIR_VECTOR urDemand, /• Unreserved demand vector to be 

filled in */ 

EIR_VECTOR newPct) /* New satisfaction percentage 

to be filled in*/ 
{ 

whereStrC'bwpSetEirPctOUT"); 
int demand, i, pet, urPct, eirPct; 
EIR^VECTOR urAUocate; 

/♦ 

* Go through each prioirty level from hi to low... 
♦/ 

for (i = EIR_HIGHEST; i >= 0; i-) { 

demand = bwp->eirDenuind[i] + urDemand[i]; 

• None left ? 
♦/ 

if (avail <= 0){ 

newPct[i] = 0; 
urAllocate[i] = 0; 

} 
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/♦ 

* More than enough - 100% for both.., 
*/ 

else if (avail >= demand) { 
newPct[i] = 100; 
urAllocate[i] = urDemand[i]; 
avail -= demand; 

} 

else { 

/♦ 

* Have to divide available here between UQ and reserved flow EIR... 
♦/ 

if (bwp->eirDemand[i] = 0){ 

urAllocate[i] = avail; 
newPct[i] = 0; 

} 

else if (urDemand[i] = 0){ 

newPct[i] = (avail * 100)/demand; 
iirAllocate[i] == 0; 

} 

else { 

/♦ 

♦ Need to share. The overall level is satisfied N 

♦ percent. UQ and rate EIR each get their proportional 

♦ share of that N percent.. 
♦/ 

pet = max(l, (avail* lOOydemand); 

\irPct = max(2, (urDemand[i]*100)/demand); 

urPct = min(98, urPct); 

eirPct = 100 - urPct; 

assert(eirPct >= 2 && eirPct <= 98); 

newPctp] = max(2,(eirPct*pct)/100); 
urPct = max(2,(urPct*pct)/l 00); 

urAllocate[i] = (urDemand[i] * urPct)/100; 

} 

/♦ 

* Only one priority level requires this calculation - 

* all lower levels will be 0 percent satisified... 
♦/ 

avail = 0; 

> 

> 

/♦ 

* Set the output target for the unreserved priority traffic based 
on the total unreserved vector... 

*/ 

bwpUqSetTarget(bwp. eirVectorSum(urAllocate)); 
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* bwpUpdateGearEir(bwp, changeMask) 
* 

* Cycle through all GEARs using EIR at the specified priority levels and 

♦ change their SATISFACTION percentage allocations. 

♦ NOTE: Execution eflBcieny is critical, 
void 

bwpUpdateGearEir(BW_POOL *bwp, EIR^MASK changeMask) 
{ 

whereStrC'bwpUpdateGearEIR"); 

TCLASS ♦tclass; 

GEARSfflFT ♦g; 

int i, tclassTotal, poolTotal; 

INT32 ♦evPct ^ bvvp->eirPctSatisfie<t 

INT32 *evActual, ♦evDemand; 

blurt2("pool %s mask %x", bwp->root->name, changeMask); 
bwp->stats,gearUpdates++; 

poolTotal = 0; 

♦ For ail TCLASSs using this BW POOL... 
*/ 

for (tclass = bwp->tclassList; tclass != NULL; tclass = tclass->bwpThread) { 
/♦ 

♦ For all GEARs at this TCLASS... 
♦/ 

tclassTotal = 0; 

for (g = tclass->activeList; g != NULL; g = (GEARSfflFT ♦)g->dlink.next) { 
/♦ 

♦ Are any of the GEARs EIR levels affected - That is, does 

♦ the flow have a demand at a changed Satisfaction Percentage 

♦ priority level ? 
♦/ 

if (! IS_FLAG_SET(g->flags, GF_DEMANDS_EIR)) 
continue; 

if ((changeMask & g->rspec.eirActiveMask) == 0){ 
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tclassTotal += g->rspec.eirTotal; 
continue; 

} 

/* 

* Can we now allocate for an overalloced flow ? 
♦/ 

if (IS_FLAG_SET(g->flags, GF_EIR.OVERALLOCATED)){ 

blurt 1 ("recovering from eir-overalloc tciass %s", tclass->name); 
tclassFreeOverallocEir(tclass, g->rspec.eirTotal); 
g->rspec.eirTotal = 0; 

g->flags &= ~GF_EIR_OVERALLOCATED; 

} 

/♦ 

* OK - allocate our N percentage levels. Note that no more 

* than 1 level is not zero or 100. 
♦/ 

evActual = g->rspec.eirActual; 
evDemand = g->rspec.eirDemand; 

for (i = 0; i <= EIR.fflGHEST; i-H-) { 

if ((changeMask & (1 « i)) = 0) continue; 

if (evPct[i] = 100) 

evActualp] = evDentiandp]; 
elseif(evPct[i]=0) 

evActual[i] = 0; 

else 

evActuaI[i] = (evDemandp] ♦ evPct[i])/100; 

} 

/♦ 

* Note we need to update g->rspec.eirTotal and g->target 
*/ 

g->rspec.eirTotaI = eirVectorSum(g->rspec,eirActual); 

tclassTotal += g->rspec.eirTotal; 

/♦ 

* And set target rate... 
♦/ 

g->target.bps = g->rspec.eirTotal + g->rspec.cirTotal; 
/♦ 

* Check too see if we got ALL of our bandwidth taken away. 

* If so OVERALLOCATE some... 
♦/ 

if (g->target.bps = 0) bwpOverAllocEir(bwp, tciass, g); 
/♦ 

* Update TCLASS total... 
♦/ 

tclass->eirCurrent = tclassTotal; 
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} /* For all GEARS */ 
poolTotal += tclassTotal; 

} /* For all TCLASSs */ 

bwp->eirCurrent = poolTotal; /* Set the total EIR in use for the pool */ 

} 

* bwpUqDemand(bwp) 
♦ 

* Return current Memand' for unreserved bandwidth, EXTRAPOLATED from 

* current output and demand rates... 

int 

bwpUqDemand(BW_POOL *bwp, EIR_VECTOR demandV) 
{ 

whereStr("bwpUqDemand"); 
UQ *uq = &bwp->uq; 
BPS demandRate = 0; 
int old = uq->target; 
int i - 0; 

BPS outputRate = exponential weighted moving average of uq->outputRate 

if ((bwp->flags & BWF^DIR^MASK) = DIR_INBOUND){ 
eirVectorSet(demandV, 0); 
retum(outputRate); 

} 

for (i = EIR^fflGHEST; i >= 0; i-){ 

deinandV[i] = exponential weighted moving av^age of uq->demandRate[i] 

Add overstatement factor to demand at this level 

demandRate += demandVp]; 

} 

demandRate = max(outputRate, demandRate); 
uq->target = min(demandRate, bwp->limit); 
blurt2("new %ld old %ld", uq->target, old); 
retum(uq->target); 

} 
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TCB - transport control block - TCP State information for both directions 



* BCB - Buffer Control Block. Contains packet info, including parsed 

* flow spec, as well as pointers to various layers of the actual 

* packet buffer. 
•/ 



#define BCB_BUF_SIZE 
#define BCB_MAGIC 

typedef struct _bcb { 
struct _bcb *next; 
struct bcb 



1S48 



Oxdeadbeef 



"last; 



INT32 
int 

INT8 



linkHeaderLen; 



♦/ 



frameMaxSize; 
""fiiameStart; 



embedded at iiameStart-4 ♦/ 
INT8 *dataStart; 
length ♦/ 

int dataLen; 
USEC_TIME tick; 
output and 

timestamp of when it was sent 
FLOW_SPEC fspec; 



protLength; 
•/ 



INT8 *protHdrStart 
whatever is ♦/ 

INT16 
IP packet. 

OBJECTJD tclassid; 
*/ 

}BCB; 
/* 

* FLOW components and object 
♦/ 

typedef unsigned char DIRECTION; 

#define DIR_INBOUND 0 
#define DIR OUTBOUND 1 



/* length of link header 

/* Actual buffer memoiy 

/* Note a pointer to the TOP of the ♦/ 

/* bcb is 

I* Current Data pointer and 

I* time the packet ^ould be 
/* 

*/ 

/♦ Parsed FLOW description... 
/* Points to and length of 

I* Encapsulated in the 
/* class ID for accounting purposes 
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typedef struct { 

INT16 port; 

INT32 host; 
} SOCKET; 

typedef struct { 

SOCKET inside; 

SOCKET outside; 
} CONN; 

typedef struct _flow_j!pec { 
CONN conn; 

#defme PFJNET 0 
#defme PF_IPX 1 
#define PF_ATALK 2 
#define PF_NETBIOS 3 
#define PF_SNA 4 
#define PF_FNA 5 

INT8 protPamiiy; 

INT8 tos; 

INT8 ipProtocol; 

INT8 direction; 

Servicelndex service[NUM_DIRECTIONS]; 
INT16 uriLen; 
char *uri; 
} FLOW_SPEC; 

/* 

* For storing URIs 
*/ 

#define MAX_URI_NAME_LEN 128 

typedef struct _uri_buf { 
struct _uri_buf *next; 

char name|>lAX_URl_NAME_LEN]; 
} URI_BXJF; 

/♦ 

* For matching URIs 
♦/ 

typedef struct { 

#defme URI_FLAG_QUERY 
*/ 

#define URI_FLAG_SPECIAL_EXACT 



(0x01) /* •?» present? 

(0x02) /* special-case exact-match? */ 
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#define URI_FLAG_SPECIAL_LEFT (0x04) 
*/ 

#define URI_FLAG_SPECIAL_RIGHT (0x08) 
♦/ 

unsigned char flag; 
bitmask ♦/ 

unsigned char count; 
♦/ 



URI_BUF 
non-nil) */ 

URI^BUF 
non-nil) */ 

} URI^SPEC; 



'''uriBufCompile; 
*uriBufLiteral; 
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/* do special-case left-match? 
/* do special-case right-match? 

/♦ UR1_FLAG_* 
/* no. of segments in uriBufQ 

/* the compiled URI (if 
/♦ the original URI (if 



/* 

* Define Traffic Class Specifier... 
♦/ 

typedef struct _t_spec { 

FLOW^SPEC fspec; 

FLOW^SPEC fspecMask; 

URI_SPEC uriSpec; 
} TSPEC; 



/♦ present if uriSpec.uriBufCompile != 0 ♦/ 



/* 

* TCLASSs are directional. There is both an INBOUND and OUTBOUND 

* root partition. NOTE this accomodates ASYMMETRIC links. 

* INBOUND partition refers to direction of data flow, not of connection 

* origin 

* TCLASSs have a component specifications (lists of TSPECS) which is 

* ordered by increasing specificity. The global flow implicitly matches 

* all traffic classes. 

* TCLASSs are orgamzed into an N-ARY tree, where children of tclasses 
^ are arranged in order of exception, normal, and inheritable. 

* Sorting/searching order is 

* exception tclasses without applied policy 

* exception tclasses with applied policy 

* tclasses >yithout policy 

* tclass with policy 

* inheritable tclass with policy (inheritable has policy by definition) 

* within each group, tclasses are ordered by Decreasing Match Specificity 
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* A FLOW is matched to a policy as follows: 

* Starting at the global partition (implict component match), 
♦ 

* a MATCHALL tspec/policy exists at the top level for both the 
^ inbound and outbound tree. 

tclass_check(tclass, flow) 

♦ 



Intermediate traffic classes (those with children) do NOT 
have associated policies. 



flow) 



tclass are checked in above specified order against flow. 
If they match, recurse to call tclass_check(match_tclass. 



* OK - we have found the most specific tclass. If it has an 

* associated policy, we are done. Else we need to check back 
up 

* thorugh the tree for an inherited policies. 



* Checking for inherited policies goes up to parent, and then 

* down through parents remaining siblings, then recurses up 
to 

* parents parent, etc... till a match is found. 



typedef struct _tclass { 
struct _tclass ♦sibling; 
struct _tclass ♦child; 
struct _tclass ♦parent; 

#define TCF JNBOUND DIR^INBOUND 

#define TCF_OUTBOUND DIR__OUTBOUND 

#define TCF_DIR_MASK 0x0001 

#define TCF_ACTIVE 0x0002 

#define TCF JDLE 0x0000 

#define TCF_GLOBAL 0x0004 

#define TCF^DISABLED 0x0008 

#define TCF^EXCEPTION 0x0010 

#define TCF^INHERITABLE 0x0020 

#define TCF^STANDARD 0x0000 

#define TCF^TYPE^MASK (TCF^EXCEPTION+TCF^INHERITABLE) 

#define TOLAS S,TYPE(t) . ((t)->flags & TCF_TYPE_MASK) 

#define TCF^WEB 0x0100 

#define TCF AUTO CLASSIFYING 0x0200 
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#define TCF_AUTO_CREATED 0x0400 

INT16 flags; 
INT8 direction; 

OBJECT JD id; 

INT32 hits; 

/♦ 

* Rate pool stuff ... Need to have EIR limits/priorities as well 

* as CIR limits.. 
♦/ 

PKT_ACCUM pktAccum; 
EWMA STATS traffic; 



TSPECJTEM ♦imsTspecList; " /♦ IMS 

ordered TSPEC ITEMs */ 

POLICY JTEM *policy; /♦ 

May be NULL */ 
/♦ 

Points to self or root class is CIR set or inherited or NULL if 

* CIR is not enforced... 
♦/ 

struct _tclass *cirRoot; 

BPS cirRootLimit; 
BPS cirRootCurrent; 
/♦ 

^ Values for this Tclass... 
♦/ 

BPS cirCurrent; 
BPS eirCurrent; 
BPS cirOverAUocated; 
BPS eirOverAllocated; 

BPS cirCommitted; /♦ Amount of 

CIR committed for outstanding HTTP request */ 
LATENCY xactLatency; 

CIR_STATS cirStats; 
/* 

* List of active sessions/connections^.. 
♦/ 

DLINK activeList; 

struct _bw_pool ♦bwp; 
struct _tclass *bwpThread; 

char name[SYS_OBJ_NAME_LEN]; 



} TCLASS; 
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♦ Abstracted packet path... 

main_pktj>ath() 
{ 

if its an IP input packet, it is parsed with parseTspec to 
embed an fspec inside the bcb 

the packet is processed by the matching protocol 
finite state machine 

} 

* ipParseFlow(bcb) 

♦ Build basic flow spec for the given input packet. Just parse basic ip 

* info and leave headers in net format. 

*/ 

int 

ipParseFiow(BCB *bcb) 
{ 

FLOW_^SPEC *fspec = &bcb->fspec; 
CONN ♦conn = &fspec->conn; 
INT8 ♦cp = NULL; 
INT16 len, hdrLcn; 

memset(fspec, 0, sizeof(FLOW_SPEC)); 
fspec->protFamily = PF_INET; 

/♦ 

* Extract IP source and destination and protocol type from net header. 
♦/ 

if (bcb->device = INSIDE JFACE) { 

conn->inside.host = get32(&bcb->dataStart[12]); 
conn->outside.host = get32(&bcb->dataStart[16]); 

} 

else { 

conn->inside.host = get32(&bcb'->dataStart[16]); 
conn->outside.host = get32(&bcb->dataStart[12]); 

} 

fspec->ipProtocol = bcb->dataStart[9]; 
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fspec->tos = bcb->dataStart[l]; 
/♦ 

* Advance to next protocol header,.. 
*/ 

hdrLen = ((bcb->dataStart[0] & OxOf) « 2); 

cp = bcb->dataStart + hdrLen; 

/* 

♦ If it is TCP or UDP extract ports... 
*/ 

if (fspec->ipProtocol = TCP_PTCL \\ fspec->ipProtocol = UDP^PTCL) { 
if (bcb->device INSIDE JFACE) { 
conn->inside.port = getl6(cp); 
conn->outside.port = getl6(cp + 2); 

} 

else { 

conn->inside.port = getl6(cp + 2); 
conn->outside.port = getl6(cp); 

} 

} 

bcb->protHdrStart = cp;„ 

len = getl6(&bcb->dataStart[2]); 
bcb->protLength = Icn - hdrLen; 

fspec->direction = (bcb->device = INSIDE JFACE) ? DIR_OUTBOUND : 
DIR_INBOUND; 

retum(OK); 

} 



NOTE: WEBNESS may be set for a connection by virtue of that connection being 
on the well known HTTP port (80), or by an explicit user configuration 
indicating that a certain port if for a web server, or by automatic recognition 
process which detects an http 'signature' in traffic destined towards a 
certain repetitively used (server) port. 

tcp_£inite_state_machineO 
{ 

handles connection establishment and clearing 

for data on an established connection, if 
connection has WEBNESS, check to see if it an 
http request. 
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* httpParseRequest(tcb, bcb, tcpHdr) 

* If we find a new URI from a GET operation, re-lookup the policy for the 

* OTHER direction flow. So the HTTP BURST will effectively start from the 

* time of the GET. When the actual data burst starts in the other 

* direction, this policy will already be set, i.e. tmgrSetPoIicyQ will 

* do nothing. 

♦/ 
int 

httpParseRequest(TCB *tcb, BCB ♦bcb, TCP_HDR ♦tcpHdr) 
{ 

whereStr("httpParseRequest"); 
int dir = bcb->fspec.directior^ 
int len, rc; 

CONN *coim = &bcb->fspec.conn; 
char *iiri, *method; 

HALF^CONNJNFO ♦hci - &tcb->halfConns[dir]; 
HALF_CONN_INFO ♦otherHci = hci->other, 

/* 

*Is it a GET or POST? 
♦/ 

if ((len = httpGetUri((char ♦)tcpHdr->dataStart, tcpHdr->dataLen, TRUE, &uri, 
&method)) > 0) { 

/* 

* Set URI pointer in flowspec embedded in BCB... 
♦/ 

bcb->fspec.uri = uri; 
bcb->fspec.uriLen - len; 

/♦ 

* httpSetPolicies will CLASSIFY this URL and find policy... 
*/ 

rc - ht^SetPolicies(tcb, bcb, tcpHdr); 

if (rc NOTOK) { 

admitFailHttpGetRequest(tcb. bcb, tcpHdr); 
return NOTOK; 

} 

/♦ 



wo 98/26510 PCT/US97/22550 

47 

* Check for Admission Control... 
*/ 

if (admitHttpGeKtcb, dir. bcb, tcpHdr, (rc = OK_BUT_NO_CIR)) != OK) 
return NOTOK; 

bcb->fspec.uri = NULL; 
/* 

* This is the start of a burst on the other connection direction (HCI)« 
♦/ 

otherHci->buistState = HCF_BURST_WAITING_FOR_REPLY; 
otherHci->reqStamp = sysTimeStampO; 

} 

else ( 

if (method && httpValidateMethod(method)){ 
httpFsmMibOtherC)ps++; 

if (tmgrInitTcpHalf(tcb, dir, NULL. 0) != OK j| 

tmgrImtTcpHalf(tcb, OTHER_DIRECnON(dir), NULL, 0) != 

OK){ 

admitFailHttpGetRequest(tcb, bcb, tcpHdr); 
retum(NOTOK); 

> 
} 

} 

retum(OK); 

> 

♦ httpGetUri(pkt, pktLen, getOrPostOnly, pUri, pMethod) 

* Check for a valid HTTP request If valid, parse URI and method. 

♦/ 

int 

httpGetUri(const char *httppkt, 
int pkden, 
int gctOrPostOnly, 
char **pUri, 
char ♦♦pMethod) 

{ 

whereStr("httpGetUri"); 
char *header, *protocol, ♦method, ♦op; 
int headerlen, methodlen, plen, urilen; 
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♦pMethod = NULL; 
method = NULL; header = NULL; protocol = NULL; 
headerlen = methodlen = plen = urilen = 0; 

headerlen = geuine(httppkt,plctien,&header); 

if ((header = NULL) \\ (headerlen <= 0)) return (-1); 

method = header; 

method += strspn(method," \t"); 

if (method == NULL) 
return (-1); 

'^'pMethod » method; 

if (getOrPostOnly){ 

if (stmcmp(method,"GET",3) != 0 && stmcmp(method,"POST",4) 0) 
return (-1); 

} 

else if (httpValidateMethod(method) = 0) 
retum (-1); 

♦pUri = strpbrk(method," \t"); 
methodlen = *pUri - method; 
♦pUri += strspn(*pUri," \t"); 

protocol = strpbrk(*pUri," \t"); 
urilen = protocol - *pUri; 

cp = strpbrk(protocol,"\r"); 
plen = cp - protocol; 
cp += strspn(cp,"\r\n"); 

if (plen <= 0) 
{ 

attn("no protocol"); 
urilen = 0; 

) 

retum urilen; 

} 

♦ httpSetPolicies(tcb, getFspec) 
♦ 
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** 

*/ 

httpSetPoliciesCTCB *tcb, BCB *bcb, TCP_HDR *tcpHdr) 
{ 

int dir = bcb->fspec.direction; 

HALF_CONN_INFO *otherHci = &tcb->halfConns[OTHER_DIRECTION(dir)]; 

TCLASS* tclass; 
POLICY JTEM* pi; 

HALF_CONN_INFO ♦hci = otherHci->other; 

int setResuit, tos; 

* Set any service abstraction in bcb.fspec... 
♦/ 

serviceSetFlowSpec(&bcb->fspec, tcb->service, tcb->flags & TF_DIR_MASK); 



bcb->fspec.direction = OTHER_DIRECTION(dir); 
autoclassification */ 



/* need correct direction for 



setResult = tmgrSetPolicy(&otherHci->policyId, 



TRUE); 



&otherHci->tclassId, 

&bcb->fspec, 

tcpSpeedGet(tcb), 

&otherHci->gears, 

tcb->hosts. 



bcb->fspec.direction = din 

TCB_SET_POLICY(tcb, OTHER_DIRECTION(dir)); 
/♦ 

♦ Now initialize the client->server direction. 
♦/ 

if (tmgrInitTcpHal£(tcb, dir, bcb->fspec.uri, bcb->fspec.uriLen) = NOTOK) 
return NOTOK; 



return (setResult = OK) ? OK : OK_BUT_JNO.CIR; 



} 



♦ tmgrSetPolicy(pidPtr, tclassIdPtr, fspec, bps, gears, hosts) 

Called at the appearance of a BURST or an HTTP REQUEST. Called by 
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♦ both TCP and UDP state machines. 

♦ Set allowed CIR/EIR based on current link state, traffic class, and 

♦ connection speed... 
♦ 



*/ 

int 

tmgrSetPolicy(OBJECTJD *pidPtr, 

OBJECT JD *tclassIdPtr, 
FLOW_SPEC ♦fspec, 
SPEED bps, 
GEARSfflFT ♦g, 
void *hosts, 
int forceHits) 

{ 

whereStr("tmgrSetPoiicy"); 

POLICY JTEM *oldPi = NULL; 

POLICY JTEM ♦pi = NULL; 

TSPECJTEM ♦ti; 

OBJECT JD oldPid = ♦pidPtr; 

OBJECT JD oidTclassId = ♦tclassIdPtr; 

TCLASS *tclass = NULL; 

BOOL newUri; 

BOOL newPolicyOrTclass ^ FALSE; 

int latency, confidence, retval; 



/♦ 

* Do we already have a policy and tclass set ? could be a re-lookup, e.g. 

* persistent HTTP second click... 
♦/ 

if (pidPtr.>vaI != 0) 

pi ~ oldPi = policyIdToPolicy(*pidPtr); 
if (tclassIdPtr->val != 0) 

tclass = tclassIdToTclass(*tcIassIdPtr); 

/♦ 

* If it is URI based, we may be changing policy/tclass. So we 

* need to free any bandwidth here we may currently own. 
♦/ 

newUri = (fq)ec->uri != NULL); 
/♦ 

* Lookup Traffic Spec if policy is valid... 
*/ 

if (newUri H oldPi = NULL |1 tclass = NULL) { 
newPolicyOrTclass = TRUE; 
pMatchSpecial(fspec); 
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/* 

* Find the class and policy for a flow in this direction... 
*/ 

pi = tclassCheck(fspec, GIobalTclasses[fspec->direction], &tclass, &ti); 
*pidPtr = pi->id; 
*tclassIdPtr = tclass->id; 
doAccounting = TRUE; 



if (newUri) { 
/* 

* If policy changed, clear the gear state. Else gear is'simply in 

* QUIESCENT state and can be quickly reactivated... 
*/ 

if (oldPi pi il oldTclassIdval !« tclass->id.val) { 
/* 

* NewPolicy is TRUE... 
♦/ 

tmgrMibSwitchedPolicies++; 
trngrReleaseBwidthColdTclassId, g); 
tnigrResetGear(g); 

} 

else 

newPolicyOrTclass = FALSE; 

} 

/* 

* If it is an existing policy, check to see if speed has changed... 
*/ 

if (newPolicyOrTclass = FALSE && g->flags != 0) { 
/* 

* Speed changed ? 
♦/ 

if (g->lastScaledBps != bps) 

tnigrMibSpeedChanges++; 

/* 

* Re-awakening ? 
*/ 

else if (IS_FLAG_SET(g->flags, GF^QUIESCENT)) 
tmgrMibRebursts++; 

/♦ 

* No change... 
♦/ 

else { 

tmgrMibNoChanges++; 
return OK; 

} 

} 

/* 

* Else it is a new policy. Shouldn't be any previous allocation 
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* unless the in force policy was deleted and this gear changed 

* rate or P-HTTP... 
♦/ 

else { 

if (oldPi i= pi j] oldTclassId.val != tclass->id.val) 

doAccounting = TRUE; 
if (IS_FLAG_SET(g->flags, GF_ACTIVE)){ 

tmgrMibSwitchedPolicies4-f-; 

tmgrReleaseBwidth(oldTclassId, g); 

tmgrResetGear(g); 

} 

else 

tmgrMibNewPolicies++; 

} 

if (newUri) { 

retval = policyCheckBwidth(pi, tclass, g, bps); 

} 

else { 

policyAllocBwidth(pi, tclass, g, bps); 
retval = OK; 

} 

return retval; 

} 

* tclassCheck(fIow, tclass, tclassPtr, tspecItemPtr) 

* Check child tclass tspecs and RECURSE until leaf tclass matches. 

* If final traffic class has a policy, apply it Else search back up 

* through the class tree for an inheritable policy at a matching traffic 

* class. 
♦ 

* We assume direction is already checked. 
POLICY JTEM ♦ 

tclassCheck(FLOW_SPEC ♦fspec, TCLASS »tclass, TCLASS ♦♦tclassPtr, TSPECJTEM 
♦♦tiPtr) 

{ 

POLICY JTEM *pi; 
TCLASS ♦child; 
whereStr("tclassCheck"); 
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/* 

* Check Child traffic classes... 
*/ 

for (child = tclass->child; child != NULL; child = child->sibling) { 

/* 

** Do we match this child's tspec(s) ? 
*/ 

if (tclassMatchTspec(fspec, child->iinsTspecList, tiPtr)) { 
/• 

* We matched. RECURSE to check matching child... 
♦/ 

blurt2("branching to %s from %s", child->name, tclass->name); 
if ((pi = tclassCheck(fspec, child, tclassPtr, tiPtr)) != NULL) 
return pi; 

} 

} 

if (tclass->child !-NULL) { 
♦tclassPtr = NULL; 
return NULL; 

} 

/* 

* Is a leaf, belongs to this traffic class. If we have a policy - 

* - apply it. Else search upwards till we find a matcbdng 

* inheritable policy... 
*/ 

*tclassPtr = tclass; 

autoRecordFlow(tclass, fspec); 

if ((pi = tclass->policy) != NULL) { 

blurt 1 ("policy at tclass %s'*, tclass->name); 
retum(pi); 

} 

/♦ 

* No match, find inheritable policy for this flow... 
*/ 

assert((pi = policyFindDefault(tclass, fspec)) !» NULL); 
retiim(pi); 

} 



* tclassMatchTspec(fspec, tclass, tiPtr) 
♦ 

* Check to see if a flow is a component of a traffic class. 
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V 

int 

tclassMatchTspec(FLOW_SPEC *fspec, TSPECJTEM *clist, TSPECJTEM ♦♦tiPtr) 
TSPECJTEM *ti; 
♦tiPtr = NULL; 

for (ti = clist; ti != NULL; ti ti->next) { 
if(tspecIsMatchAll(&ti->tspec)) 
retum(TRUE); 

if(fspec->protFamily N ti->tspec.fspec.protFamily) 

continue; 
if(fspec->protFamily != PF_INET) 

retum(TRUE); 

if(ti->tspec*fspec.ipProtocoI && fspec->ipProtocol != ti->tspec.fspec.ipProtocol) 
continue; 

if (tspecMatch(fspec, &(ti->tspec))) { 
*tiPtr = ti; 
retumCTRUE); 

} 

> 

retum(FALSE); 

} 

* tspecMatch(FLOW_SPEC ♦fl, TSPEC ♦tZ) 

♦ NOTE fl is the live flow and t2 is tbe TSPEC. 

*/ 

int 

tspecMatch(FLOW_SPEC ♦fl, TSPEC *t2) 
{ 

FLOW^SPEC *f2 = &(t2->fspec); /♦ we treat tspec as a 2nd flow ♦/ 

CONN ♦mask = &(t2->fspecMask.conn); /* but latch the mask separately ♦/ 
int prec; 

CONN ♦cl = &fl.>conn; 
CONN *c2 = &f2->conn; 



wo 98/26510 PCT/US97/22550 

55 

INT32 hmask; 

if(fl->inService && f2->inService && (fl->inService != G->inService)) 
retum(FALSE); 

if (mask->insicie.port && cl->iiiside.port && 

((cl->inside.poit < c2->inside.port) [j (cl->inside.port > mask->inside.port))) 
retum(FALSE); 

if (((hmask = mask->inside.host) != 0) && 

((cl->inside.host & hmask) !== (c2->inside.host & hmask))) 
retum(FALSE); 

if(fl->outService &.&, f2->outService && (fl->outService != f2->outService)) 
retum(FALSE); 

if (mask->outside.port && cl^outside.port && 

((cl->outside.port < c2->outside.poit) || (cl->outside.port > mask->outside.port))) 
retum(FALSE); 

if (((hmask = mask->outside.host) != 0) && 

((cl'->outside.host & hmask) !- (c2->outside.host & hmask))) 
retum(FALSE); 

/* 

* No port spec for ICMP... 
♦/ 

if(fl->ipProtocol = ICMP^PTCL 

&& (c2->inside.port j[ c2->outside.port |1 f2->inService > SVC_CLIENT || 
f2->outService > SVC_CLIENT)) 
retum(FALSE); 

/♦ 

* IP Precedence is an orthiginal classification element... 
♦/ 

prec = PREC(fl->tos); 

if(prec < PREC(f2->tos) jj prec > PREC(t2->fspecMask.tos)) 
retum(FALSE); 

/♦ 

* Check for URI here !!!!!! WILDCARD match... 
♦/ 

if (t2->iiriSpec.uriBufCompile) { 
if(fl->uri = NULL) 
retum(FALSE); 

retum(httpIsUriMatched(&(t2->uriSpec), fl->uri, fl->uriLen)); 

} 

retum(TRUE); 

} 
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* httpIsUriMatchcd(iiriSpec,stringBuf,stringLen) 
* 

* This matches the URL in stringBuf[] to the 

* (previously-compiled) URI_SPEC ♦uriSpec. 

* Returns nonzero on match; zero on no-match. 

♦ * 

int 

httpIsUriMatched(URI_SPEC ♦uriSpec,char ♦stringBuf,mt stringLcn) 
{ 

char ♦s - stringBuf; 

char *stringEnd = festringBuflstringLen]; 

char *IistPtr; 

char *itemPtr; 

int itemLen; 

char *itemEnd; 

int i; 

char ich; 

char sch; 

char *p, ♦pEnd; 

char pch; 

/♦ 

♦ Skip over leading "http:" 
♦/ 

if((stringLen >= 6) && 

(!memcmp(s,"http:",5) H !memcmp(s,"HrrP:",5))) { 

stringLen — 5; 
s += 5; 

} 

listPtr = uriSpec->uriBufCompile->name; 
/* 

* Special case? 
♦/ 

if((uriSpec->flag & 

(URI_FLAG_SPECIAL_EXACT ' 
URI^FLAG^SPECIAL^LEFT } 
URI_FLAG_SPECIAL_RIGHT)) != 0) { 

/♦ 
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* <literal> special-case? NOTE: This special-case is REQUIRED, as 

* the generic code below does NOT handle the literal-only case 

* correctly. Doing so would add more code below. 
*/ 

if((uriSpec.>flag & URI^FLAG^SPECIAL^EXACT) != 0) { 
/♦ 

* Length to end of sQ must match exactly. 
♦/ 

if((itemLen = ♦listPtr++) != stringLen) 
retum(0); 

} 

/* 

* <literal>* special-case? NOTE: This special-case is correctly 

* handled below; this is a perfonnance hack. 
*/ 

else if((uriSpec->flag & URI_FLAG_SPECIAL_LEFT) != 0) { 
/♦ 

* sQ must be equal or longer than compare string 
*/ 

if((itemLen = ♦listPtr++) > stringLen) 
retum(0); 

) 

/♦ 

* *<literal> special-case? NOTE: This q)ecial-case is correctly 

* handled below; this is a perfonnance hack. 
♦/ 

else if((uriSpec->flag & URI_FLAG_SPECIAL_RIGHT) != 0) { 
++listPtr; 
/* 

* sQ must be equal or longer than compare string. 
*/ 

if((itemLen = *listPtr++) > stringLen) 
retum(0); 

/♦ 

* We compare from the right 
*/ 

s = (stringEnd - itemLen); 

> 

/♦ 

* Case-insensitive compare, stringBufQ and listPtr[], itemLen bytes 

* (assimies listPtr[] already lower-cased in httpCompileUriQ) 
•/ 

for(; itemLen != 0; itemLen—) { 
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sch = 

if(isupper(sch)) sch = tolower(sch); 

if(sch ♦listPtrH-) 

rctam(O); 

} 

/* 

* If we did not fail to match, we must match! 
*/ 

retum(l); 

} 

/* 

* Loop thru segments 
*/ 

for(i = uriSpec->count; i != 0; i~) { 
/• 

* Latch length and end of this segment 

* If zero length, matches immediately. If nonzero, search for 

* match. 
♦/ 

ifiKitemLen = ♦listPtrH-) != 0) { 
/♦ 

* Cannot match if remaining string is shorter than item 
♦/ 

if(itemLen > (stringEnd - s)) 
retum(0); 

/♦ 

* Latch beginning and end of this segment. 
*/ 

itemPtr = listPtr; 

itemEnd = (itemPtr + itemLen); 

/• 

* First segment must match exactly on the front 
*/ 

i£(i == uriSpec->count) { 
do { 

if((ich = *itemPtr) != '?') { 
sch = *s; 

if(isupper(sch)) sch = tolower(sch); 
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/♦ if byte doesn't match, NO MATCH ♦/ 
if(ich != sch) 

retum(0); 

} 

S-H-; 

} while(++itemPtr < iteiiiEnd); 

} 

/♦ 

* Last segment mxist match exactly on the end. 
♦/ 

else if(i = 1) { 

s = (stringEnd - itemLen); 
do { 

if((ich = ♦itemPtr) != '?') { 
sch = *s; 

if(isupper(sch)) sch tolower(sch); 

/♦ if byte doesn't match, NO MATCH ♦/ 
if(ich != sch) 

retiim(0); 

} 

S++; 

} while(-H-itemPtr < itemEnd); 

> 

/* 

* Following segments can accept intervening strings (for 
*/ 

else { 

/* 

* End of region where tills segment can start 
♦/ 

pEnd = (stringEnd - itemLen +1); 

again: 

/* 

* Search for matching first byte (unless it is *?') 
•/ 

if((ich = ♦itemPtr) != '?') { 
p = s; 
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for(:;) { 

pch = *p; 

if(isupper(pch)) pch = tolower(pch); 

if(ich = pch) { 
s = p; 
break; 

} 

if(!(++p < pEnd)) 
retum(0); 

} 

} 

/♦ 

♦ Skip over that byte. 
*/ 

++s; 

-H-itemPtr; 
/♦ 

* Match remmning bytes 
*/ 

for(; itemPtr < itonEnd; s++, itemPtr++) { 
ifl[(ich = *itemPtr) *?') { 
sch = *s; 

if(isupper(sch)) sch « tolower(sch); 

if(ich != sch) { 

/♦ if out of string, NO MATCH */ 
if(itemLen > (strii^End - ++s)) 
retum(0); 

/♦ if not out of string, try again */ 
itemPtr = lisfftr; 
goto ^ain; 

} 

> 

} 

> 

I* . 
* Go on to next segment 

*/ 

listPtr » itemPtr, 
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} 

} 

/* 

* If we got here, MATCH! 
*/ 

return(l); 
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For building a classification tree by adding a class or adding an additional tspec to 
an exisitng class and maintaining the specificity order automatically, we 
have the following... 

* tclassAddTspec(tclass, tspec) 

* Allocate a TSPEC_ITEM and enlist it at this traffic class, 
slaticf void 

tclassSetupTspec(TCLASS ♦tclass, TSPEC JTEM ♦ti) 
{ 

TSPEC^ITEM ♦♦1st = &tclass->imsTspecList; 

while (♦Ist != NULL && tspecLessSpecific(*lst, ti)) 

1st = &(^lst)->next; 
tionext ♦Ist; 
♦lst = ti; 

if (ti = tclass->imsTspecList && ti->next != NULL) { 
tclassRemoveQiiId(tcIass); 
tclassInsertChild(tclass->parent» tclass); 

> 

tefgConfigChanged = TRUE; 
} 

* void 

* tciassNew(parent, dir» name, tflags, tspec, tclassPtr) 

* Make sure TCLASS and BWP direction are the same* 

♦/ 

void 

tclassNewCrCLASS ♦parent, 
int dir, 
char came, 
int tflags. 
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TSPEC *tspec, 
TCLASS **tclassPtr) 

{ 

whereStr("tclassNew"); 

TCLASS *tclass; 

TSPECJTEM *ti; 

INT32 dummy; 

BW_POOL *bwp; 

int global = (parent = NULL) ? TRUE : FALSE; 

if ((tclass = tclassAilocO) == NULL) { 
fatalCOut of TCLASSs!"); 

} 

tclass->flags |= (dir & TCF_DIR_MASK) | tflags; 

if (global = FALSE) { 

ti = tclassGefrspec(tclass, tspec); 

VALIDATE_TCLASS(parent); 
tclass->parent * parent; 

if (tclassValidateMembership(parent, tspec, TRUE) != OK) { 
fatal("TSPEC not reachable"); 

> 

bwp = parent->bwp; 

} 

else { 

bwp = GlobalBwPools[dir]; 

tclass->Hags |= TCF_GLOBAL+TCF_CIRLIMIT_NONE; 

} 

/* 

Thread the tclass into its bandwidth pool as well... 

♦/ 

tclass->bwp = bwp; 
bwpAddTclass(bwp, tclass); 
/* 

* If new tclass is NOT global, thread it into the classifier tree. 

* It should be inserted in Match Specificity order onto the 

* child.sibling list 
*/ 

if (global == FALSE) { 

tcIassSetupTspec(tclass, ti); 
tclassInsertChild(parent, tclass); 
bwpRedistributeBwidth(dir); 

} 

*tclassPtr = telass; 

) 
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mm 

♦ tclassInsertChild(parent, tclass) 

♦ Insert in child list in DMS order. 

mmmmmmmmmmmmmmmmmmmmmmm*mmmm**mmmmmmm**m*mmmmmmmmmmmm*mmmm*mm 
mm 

♦/ 

void 

tcIassInsertChild(TCLASS *parcnt, TCLASS *tclass) 
{ 

TCLASS ♦sibling, *last; 
unsigned long old, dummy; 

sibling = parent->child; 
last = NULL; 

LOCK_TASK(&old); 

while(sibling != NULL && tclassLessSpecific(tclass, sibling)) { 
last = sibling; 
sibling = sibling->sibling; 

} 

tclass->sibling = sibling; 
tclass->parent = parent; 

if (last = NULL) 

parent->child = tclass; 

else 

last*>sibling - tclass; 
UNLOCK_TASK(old,&dummy); 
tcfgConfigChanged = TRUE; 

> 

/mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm^mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm 
mm 
m 

♦ tclassLessSpecific(a, b) 

♦ Return TRUE if a is less match specific than b, that is if a should 

♦ be ordered AFTER b... 

mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm*mm*mmmmmmmmmmmmmmmmmmmmmmmmmmmmmmm 

I 

staticf int 
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tclassLessSpecific(TCLASS *a, TCLASS ♦b) 
{ 

int rc; 

if (IS_FLAG_SET(a->flags, TCF_EXCEPTION)){ 

if (! IS_FLAG_SET(b->flags, TCF_EXCEPTION)) 
retum(FALSE); 
} . 

else if (IS_FLAG_SET(b->flags, TCF_EXCEPTION)) 
return(TRUE); 

if aS_FL AG_SET(a->flags, TCF_INHERITABLE)) { 

if (! IS_FLAG_SET(b->flags, TCF_INHERITABLE)) 
retura(TRUE); 

} 

else if aS_FLAG_SET(b->flags, TCF_INHERITABLE)) 
retum(FALSE); 

/* 

* Ok - need to order by tspec IMS... 
♦/ 

if ((rc = tspecLessSpeciiic(a->imsTspecList, b->imsTspecList)) = TRUE 
&& tclassIsMatchA110>) && a->duld !- NULL) { 
if (b->child NNULL) 

rc — tclassLessSpecific(a->child, b->child); 
rc - FALSE; 

> 

/♦ 

* Ok - now order by policy/no policy... 
♦/ 

if (rc = TRUE && a->policy == NULL && b->policy != NULL) 

retum(FALSE); 
retum(rc); 

} 

* tspecLessSpecific(a, b) 

♦ Return TRUE if (a) less specific than (b). 

#define TSPEC_EQ(a,b) (memcmp((a), (b), sizeof(TSPEC)) = 0) 
int 

tspecLessSpecific(TSPECJTEM ♦a, TSPEC JTEM ♦b) 
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{ 

int abits, bbits; 

FLOW__SPEC ♦amask, *bmask, ♦af, *bf; 
whereStr("tspecLessSpecific"); 

if(tspecIsMatchAll(&a->tspec)) 

rctiim(TRUE); 
else if(tspecIsMatchAll(&b->tspec)) 

retum(FALSE); 

af = &a->tspec.fspec; 
bf = &b->tspec.fq)ec; 

if(af->protFamiiy != bf->protFamiiy) 

retum(af->protFamily > bf->protFainily); 

if(af->protFainily != PF_INET) 
retum(FALSE); 

/* 

* OK - Compare host masks... 
♦/ 

amask = &a->tspec.f5pecMask; 
bmask - &b->t^ec.fspecMask; 

abits = tspecHostBits(&amask->comi); 
bbits = tspecHostBits(&bmask->comi); 

if (abits !== bbits) retum(abits < bbits); 
/♦ 

♦ Compare port ranges... 
♦/ 

abits — tspecPortBits(&af->comi, &amask->comi); 
bbits = tspecPortBits(&bf->comi, &bmask->comi); 

if (abits != bbits) retum(abits > bbits); /♦ here more "bits" mean less 

specific ♦/ 

abits = (af->service[DIR.INBOUND] > SVC.CLIENT); 
bbits = (bf->service(piR_INBOUND] > SVC_CLIENT); 

if(abits != bbits) 

retiim(abits < bbits); 

abits « (af->serviceP>IR_OUTBOUND] > SVC^CLIENT); 
bbits - (bf->service[DIR_OUTBOUND] > SVC^CLIENT); 

if(abits != bbits) 
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retum(abits < bbits); 

/* 

* Compare precedence ranges 
♦/ 

abits = PREC(amask->tos) - PREC(af.>tos); 
bbits = PREC(bmask.>tos) - PREC(bf->tos); 

if(abits != bbits) 

retum(abits > bbits); 

/♦ 

* Finally - compare URL if it is present in either... 
*/ 

if (af->uri j| bf->uri) retum(urlLessSpecific(af->iiri, bf->uri)); 
/♦ 

* They are the same - true on a tie... 
*/ 

retum(TRUE); 

} 

♦ 

* urILessSpecific(a, b) 

*/ 

int 

urlLessSpecific(char *a, char *b) 
{ 

int aw, bw; 
char ♦cp; 

if(!a) 

retum(TRUE); 
else if (!b) 

retiim(FALSE); 

/* 

* OK - both have URL specified. Compare number of >viidcards... 
♦/ 

for (aw = 0, cp = a; *cp; cp-H-){ 
if (*cp = '*') aw4+; 

} 

for (bw = 0, cp = b; *cp; cp-H-){ 
if( cp = '**)bw++; 

} 
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retum(aw > bw ? TRUE : FALSE); 

} 



int 

hostBits(INT32 host) 
{ 

int bits, i; 

for (bits = 0, i = 31; i >= 0; i-){ 
if ((1 «i)&host) 
bitsH-+; 

else 

break; 

} 

retum(bits); 

} 

int 

portBits(INT16 port, INT16 range) 
{ 

if (port = 0) 

retum(0x00007fB); 
else if (range ~ 0) 
retum(l); 

else 

retum((range-port)+l); 

> 

int 

tspecHostBits(CONN *conn) 
{ 

retum(hostBits(conn->inside.host) + hostBits(conn->outside.host)); 

} 

int 

tspecPortBits(CX)NN ♦conn, CONN *niask) 
{ 

retum(portBits(conn->inside.port« mask->inside.port) + 

portBits(coiin->outside.port, madc->outside.poit)); 

} 

^ tclassValidateMembership(parent, tspec, verbose) 

^ Search recursively through parents to make sure we have a path from the 
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* root down to here. We ignore siblings and uncles, as specificity-order 
^ should take care of that 

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 

*/ 
mt 

tclassValidateMembership(TCLASS ♦parent, TSPEC ♦tspec, int verbose) 
{ 

VALIDATE_TCLASS(parent); 
/♦ 

♦ If we got to the top - it is valid... 
♦/ 

if (IS_FLAG_SET(parent->flags,TCF_GLOBAL)) retum(OK); 
/♦ 

♦ Is tspec a child of this parent.. 
♦/ 

if (! tspecIsChild(tspec, parent->imsTspecList)){ 

if (verbose) printf("\nNot valid member of traffic class %s", parent->name); 
retum(NOTOK); . 

} 

/♦ 

♦ OK - we fit here. Check at next parent level... 
♦/ 

retum(tclassValidateMembership(parent->parent» tspec, verbose)); 
} 



Speed Scaling - 

when a flow is allocated, the GIR and EIR are scaled per the 
detected (potential maximum) speed 

in this embodiment, the speed scaling is done by extrapolating 
from a defined high speed and low speed target rates. 



file: scale.c 
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♦ description: CIR and EIR scaling to load. 



automatically 



SCALFNG CIR and EIR to load potential: 

CIR scale is specified by user as two values, a base value 
(B) under which all speeds get 100%, and a limiting value (L) 
to cap CIR for high speed users. If a connections incoming 
speed is actually BETWEEN (B) and (L), then we create a 
scaled intermediate value for CIR 

So CIR^SPEC = {B,L} 

EIR scale is used to generate an EIR_VECTOR of allocation limits 
at each EIR priority level. EIR scale is specified as the 
highest priority level (HP), the scaled limit at that priority 
level (in terms of B and L, as with CIR). We then will 

generate the rest of the vector for the lower priority levels, up 
to an optional total limit (TL). 



So EIR^SPEC = {HP, HP-B, HP-L, TL}; 
♦ 09/18/97 Copyright Packeteer, Inc, 1996-97 @(#)scalex 



1.2 



♦/ 



* Rate Specifier, i.e. CIR + EIR... 
*/ 

typedef struct { 
BPS cirLimit; 
BPS eirDemand; 



EIR^VECTOR 
/♦ 

♦ UDP stuff tbd.. 
♦/ 

INT32 

} RATE.SPEC; 



eirDemandVector; 



bucketLimit; 



/♦ How much CIR I want */ 
/♦ total of eirDemandvector ♦/ 

/* How much EIR I want */ 



typedef struct _gearshift { 
DLINK dlink; 
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#define GF_INrnALIZED 

#define GF_SCALED 
#define GF_ACTrVE 
#define GF_EIR_OVERALLOCATED 

available*/ 

#define GF_CIR_OVERALLOCATED 
available*/ 

#defme GF_QUIESCENT 
#defme GF_PRIORITY 
#defme GF_RATE 
#defme GF_DQ 
#define GF_TCPRC 
#defme GF_FILTER 
#defme GF_NEEDS_REALLOC 
#define GF_CIR_COMMITTED 



INT32 flags; 

BPS target; 
INT32 cirlnUse; 
INT32 eirlnUse; 

BPS lastScaledBps; 



RATE_SPEC rspec; 
} GEARSHIFT; 



PCT/US97/225S0 

71 

0x0001 

0x0002 
0x0004 

0x0008 /* allocated though none 

0x0010 /* allocated though none 

0x0020 
0x0040 
0x0080 

0x0100 

0x0200 
0x0400 

0x0800 
0x1000 



/♦ 

* For scaling CIR to connection speed. See tiaffic/scalex. 

* If limit is set to 0, no CIR is allocated. 
*/ 

typedef struct { 

BPS base; 

BPS limit; 
} CIR_SPEC; 

/* 

* For scaling EIR to connection speed, i.e. oeating eirLimits vector. 

* hp -highest priority, hp-base, and hp-limit and optional total limit. 

* See trafBc/scale.c. 

* As vsrith CIR Spec, if hpLimit is set to 0, no EIR is allocated. 
•/ 

typedef struct { 
INT8 hp; 
BPS hpBase; 
BPS hpLimit; 
BPS totalLimit; 
} EIR_SPEC; 
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* Policy Item 
♦/ 

typedef struct ^policy { 
stract ^policy ♦next; 

struct _tclass *^tciass; 

INT32 flags; 
INT8 direction; 
INT% ' sessionTypc; 

INT32 hits; 
/* 

* For scaling EIR/CIR-.. 

*/ 

EIR^SPEC eirSpec; 

CIR_SPEC cirSpec; 

INT32 bucketLimit; 

/♦ 

* For Admission Control... 
*/ 

URI_.BUF ♦redirectUrl; 
/♦ 

* For Content Substitution*.. 
*/ 

SUB^SPEC *subSpec; 
} POLICY JTEM; 



* scaleToLoad(policy, tciass, gears, bps) 

* Set up EIR limit vector as well as appropriate CIR for this specific 

* connection speed (bps) for this GEAR (g). 

*/ 

void 

scaleToLoad(POLICYJTEM *pi, TCLASS ♦tciass, GEARSHIFT ♦g, SPEED bps) 
{ 

CIR_SPEC ♦cs = &pi->cirSpec; 
EIR_SPEC ♦es = &pi->eirSpec; 
BW_POOL ♦bwp = tclass->bwp; 
int eirDemand, eirLimit, eirExtra; 

g->lastScaledBps - bps; 
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/* 

* First set CIR,.. 
♦/ 

if (cs->limit > 0) { 

g->rspecxirLiniit = scaleRate(bps,cs->base,cs->limit, tclass->bwp->limit)- 

} 

else 

g->rspec.cirLiniit = g-'>cirInUse = 0; 
eirDemand = bps - g->rspec.cirLimit; 

eirVectorSet(g->rspec.eirDemandVector, 0); /* Zero vector... */ 
g->rspec.eirDemand = 0; 

if (es->hpLiniit > 0 && eirDemand > 0) { 
int cap; 
/♦ 

* Descending priority levels will get exponential chunks of 

* bandwidth, increasing in size as priority decreases. 
*/ 

eirLiinit = scaleRate(eirDeniand, es->hpBase, es->hpLimit, bwp->limit); 
/♦ 

* We allocate all of the rest (of BPS) at lower priority 

* levels unless a TOTAL LIMIT is set.. 
♦/ 

cap = bwp->limit; 

cap ~ g->rspec.cirLimit; 

if (es->totalLimit > 0 && es->totalLimit < cap) cap = es->totalLimit; 
/* 

* cap is now the largest amount of EIR we can have 
*/ 

eirExtra = min(eirDemand» cap) - eirLimit; 
if (eirExtra < 0) eirExtra = 0; 

/♦ 

* Assign limits to descending priority levels... 
♦/ 

(void)eirAssignDemand(es->hp, eirLimit, eirExtra, &g->rspec); 

} 

info4("pol %s: gear %d cir=%d, eir=%d", 

pi->tclass->name, g, g->rspec.cirLimit, g->rspec.eirDemand); 

} 
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* scaleRate(bps, base, limit, total) 



♦/ 

INT32 

scaleRate(INT32 bps, INT32 base, INT32 Hinit, INT32 total) 

INT32 pet; 
/♦ 

* If detected rate is lower than scale-base, just return base. 
♦/ 

if (bps <= base) { 
bps «= 2; 

if (bps > base) bps = base; 
return bps; 

} 

/* 

* Constrain total... 
♦/ 

if (total > limit « 1) total = limit « 1; 
/♦ 

^ If detected rate higher than total, just return limit 
♦/ 

if (bps >= total) 

return limit; 

/♦ 

* OK - we will allocate base + some percentage of (limit - base), 

* i.e. up to limit. Since we know bps < total, this will always 

* return a number less than or equal to lunit 
*/ 

pet = (bps « 7)/ total; 

return base + (((limit - base) * pet) » 7); 

} 

♦ eirAssignDemand(hp, hpLimit, eirExtra, rspec) 
staticf void 

eirAssignDemand(int hp, int eirLimit, int eirExtra, RATE_SPEC *r) 
{ 

BPS ♦b = &r->eirDemandVector[hp]; 
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int i, active; 

lot extra, extraChunk; 

int left; 

extra = max(0, (eirExtra - (eirLimit*(hp+l)))); 
left = eirExtra + eirLimit; 

r->cirDeinand = 0; 

active = 0; 

/• 

* NOTE: Eir demand has been set to 0... 
•/ 

for (i = hp; i >= 0; i-, b~) { 

if Geft < EIR_MIN_CHUNK &<& active != 0) 
break; 

active =1; 

if(i = 0) 

*b = left; 

else { 

*b = minOeft, eirLimit); 

if (extra > 0 && i < hp) { 

«ctraChunk ■ extra » (i + 1); 
*b += extraChunl^ 

} 

left — *b; 

> 

r->eirDemand += *b; 

> 

} 
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1. A method for allocating bandwidth in a packet telecommunicatioh system 
having any number of flows, including zero, using a classification paradigm comprising 
the steps of: 

parsing a packet into a flow specification, wherein said flow specification 
contains one or more instances of any of the following: 
a protocol family designation, 
a direction of packet flow designation, 
a protocol type designation, 
a plurality of hosts, 
a plurality of ports, 

in http protocol packets, a pointer to a URL; thereupon, 
matching the flow specification of the parsing step to a plurality of 

hierarchically-recognized classes represented by a plurality of nodes, each node having a 

traffic specification and a mask, according to the mask; thereupon, 

having found a matching node in the matching step, associating said flow 

specification with one of said plurality of hierarchically-recognized classes represented by 

a plurality nodes; and 

allocating bandwidth resources according to a policy associated with said class. 

2. The method of claim 1 wherein a node may have one or many traffic 
specifications. 

3. The method of claim 1 wherein said pluraliQr of hierarchically-recognized 
classes represents a plurality of orthogonal flow attributes. 

4. The method of claim 1 wherein allocating bandwidth resources further 
comprises the steps of: 

for rate based policies having a guaranteed information rslte (GIR) and an excess 
information rate (EIR), allocating GIR until no further GIR is available; thereupon, 
allocatmg EIR, 
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for priority based policies, having a priority, allocating unreserved service 
according to said priority. 

5. The method of claim 1 wherein said allocating bandwidth resources 
further comprises the step of: 

allocating a combination of GIR resources, EIR resources and unreserved 
resources, wherein GIR resources and EIR resources are allocated based upon speed 
scaling and unreserved resources are allocated based upon a priority. 

6. The method of claim 5 wherein allocating guaranteed information rate 
(GIR) resources and excess information rate (EIR) resources to said flow based upon 
speed scaling further comprises the steps of: 

detecting the speed of a network link upon which said flow is transported; 
allocatmg available bandwidth resources based upon the speed detected. 

7. The method of claim 5 wherein allocating unreserved resources to a 
particular flow, said flow having an input rate, said flow being one of said plurality of 
flows, based upon a priority further comprises the steps of: 

extrapolating from said input rate of the detecting step an extrapolated 
unreserved demand (EUD); 

aggregating together a plurality of individual flow demands to form an aggregate 
rate demand (ARD); 

combining said EUD and said ARD to form an instantaneous total demand; 

allocating available bandwidth resources based upon the instantaneous total 
demand determined in the combining step from highest priority demand first to lower 
priority demanded to yield a satisfaction percentage of demand; 

determining a target rate for individual flows by multiplying said individual flow 
demand for each flow by said satisfaction percentage of demand determined in the 
allocating step. 

8. The method of claim 7 wherein said unreserved resources are comprised 
of a plurality of individual inputs, each input having a priority, further comprising the 
step of: 
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determining a target rate for said units according to said priority by multiplying 
said extrapolated unreserved demand by said satisfaction percentage of demand 
determined in the allocating step. 

9. The method of claim 1 further comprising the step of determining that if 
guaranteed information rate resources are insufficient to accommodate said flow, 
invoking an admissions policy. 

10. In a packet communication environment allocated into layers, including at 
least a transport layer, link layer and an application layer, a method for classifying 
Internet Protocol (IP) flows at the transport layer for use in determining a policy, or rule 
of assignment of a service level, and enforcing that policy by direct rate control, said 
method comprising: 

applying individual instances of traffic classes to a classification tree based on 
selectable information obtained from an application layer of a conununication protocol; 
and thereupon, 

mapping the IP flow to the defined traffic classes, said traffic classes having an 
associated policy. 

11. A method for managing bandwidth on Internet Protocol (IP) flows in a 
packet conununication environment allocated into layers, including at least a transport 
layer, a link layer and an application layer, said method comprising: 

automatically detecting selectable information about said flow; 
determining a policy for assigning a service level to said flow based upon said 
automatically detected selectable information about said flow; and 

in9>lementing said policy by direct rate control of said flow. 

12. The method of claim 11 wherein said determining step further comprises 
applying individual instances of traffic class specifications, each trafBc class specification 
having an associated policy, to said flow based on selectable information; thereupon, 

associating with said flow the particular policy associated with said traffic class. 
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13. The method of claim 12 wherein said traffic classes have a hierarchical 
relationship to one another. 

14. A method for allocating network bandwidth, which bandwidth comprises 
guaranteed information rate (GIR) resources, excess information rate (EIR) resources and 
unreserved rate resources, on Internet Protocol (IP) flows in a packet communication 
environment allocated into layers, including at least a transport layer, a link layer and an 
application layer, said method comprising: 

automatically detecting selectable information about said flow; 

determining a policy for assigning a service level to said flow based upon said 
automatically detected selectable information about said flow, said policy directing either 
reserved service or unreserved service; ^ 

allocating bandwidth representing a combination of GIR resources, EIR resources 
and unreserved resources to a pluraUty of flows assigned said service level by said 
determining step. 

15. The method of claim 14 wherein said unreserved resources are comprised 
of a pluraliQr of individual inputs, each iiq)ut having a priori^, further comprising the 
step of: 

determining a target rate for said units according to said priority by multiplying 
said extrapolated unreserved demand by said satisfaction percentage of demand 
determined in the allocating step. 

16. The method of claim 14 wherein said allocating bandwidth representing a 
combination of GIR resources, EIR resources and unreserved resources comprises 
allocating GIR resources and EIR resources based upon a speed scaling and allocating 
unreserved resources based upon a priority. 

17. The method of claim 16 wherein allocating imreserved resources based 
upon a priority to a plurality of flows, each having an input rate, further comprises the 
steps of: 

detecting the ii^ut rate of a flow; 
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5 extrapolating from said input rate of the detecting step an extrapolated 

6 unreserved demand (EUD); 

7 aggregating together a plurality of individual flow demands to form an aggregate 

8 rate demand (ARD); 

9 combining said EUD and said ARD to form an instantaneous total demand; 

10 allocating available bandwidth resources based upon the instantaneous total 

11 demand determined in the combming step from highest priority demand first to lower 

12 priority demanded to yield a satisfaction percentage of demand; 

13 determining a target rate for individual flows by multiplying said individual flow 

14 demand for each flow by said satisfaction percentage of demand determined in the 

15 allocating step. 

16 18. The method of claim 17 wherein said unreserved resources are comprised 

17 of a plurality of individual inputs, each input having a priority, further comprising the 

18 step of: 

19 determining a target rate for said imits according to said priority by multiplying 

20 said extrapolated unreserved demand by said satisfaction percentage of demand 

21 determined in the allocating step. 

1 19. The method of claim 14 further comprising the step of determining that if 

2 guaranteed bandwidth available is insufficient to accommodate said flow, invoking an 
admissions policy. 

1 20. The method of claim 14 wherein said determining step further comprises 

2 applying individual instances of traffic classes, each traffic class having an associated 

3 policy, to said flow based on selectable information; thereupon, 

associating with said flow the particular policy associated with said traffic class. 



1 



21. The method of claim 19 wherein said traffic classes have a hierarchical 
relationship to one another. 
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1 22. The method of claim 14 farther cx)mprising allocating said bandwidth 

2 representing unreserved data resources among flows based upon a priority associated 

3 with said flows, 

4 23, The method of claim 14 farther comprising applying a partition to one or 

5 more classes, said partition having a limit on aggregate data rate flow. 

1 24. The method of claim 23 wherein said lunit is a soft partition limiting FIR 

2 for one or more classes. 
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